  • I'm Alexandra.

  • I'm a software engineer at Bella, where we're making a product that acts as a copilot for managers,

  • for one-on-ones and feedback.

  • If that's useful, check out bella.app.

  • And what I want to talk about today is something completely different that I built in my spare time.

  • It's transferring all the data you need over SMS.

  • Why would I build something like this when we have data and Wi-Fi connections all over

  • the world?

  • I come from a country where a data plan for 2 gigabytes a month costs 55 Euros and only

  • works in my hometown.

  • Three quarters of a million people are using dial-up Internet in their homes.

  • I come from Canada.

  • So, the Internet is so, so expensive just when I'm at home that when I'm traveling

  • somewhere, like coming to Berlin, the prices are out of my price range.

  • I like to travel, so data is an issue.

  • When I travel, I love to visit Paris and the streets look like this.

  • I get lost really easily in a grid-like structure.

  • When I'm over here, I need data to get around.

  • And I could technically download a map offline and use that.

  • But that doesn't give me transit directions which is something I really need when I walk

  • an hour in the wrong direction.

  • I need a subway to get back home.

  • When I was trying to come up with a solution for this problem, I noticed like I can't afford

  • the data plans.

  • But SMS without a data plan costs about 15 cents per message.

  • I tried to work around that.

  • When I was starting on a solution to this problem, chatbots were the big thing at the time.

  • I approached this problem by setting up a Python server that would receive an SMS, grab the directions

  • from the Internet and text them back to me.

  • I could ask something simple, like how do I get from point A to B, and get the Google Maps directions

  • back in one SMS.

  • I could do this for 30 cents per direction.

  • Which is pretty good compared to what I would be paying for data otherwise.

  • And this really worked, and I used it a lot when I was traveling.

  • But the issue is when you get a little bit of access to the Internet, you start to crave

  • it quite a bit more.

  • So, I was building these one-off integrations to figure out ratings for a restaurant I wanted

  • to see or a translated word I didn't know and do all this stuff.

  • And building out all these integrations as one-offs took up a lot of time.

  • I thought, there must be a better way.

  • I'll just build a browser.

  • And that's what I'm gonna show you today.

  • So, there are two main components to the project I did.

  • There's the Android app on one side and the server in NodeJS on the other.

  • The app I made in Android because I was making it just for me.

  • I don't have an iPhone.

  • I didn't care about iOS.

  • And I'm using Java instead of Kotlin because there was a lot more on StackOverflow on SMS.

  • That was a good solution for me, I'm not an app developer.

  • And NodeJS, because I thought it would be fun to use JavaScript on a server, where it doesn't

  • belong, to make this JavaScript-less browser.

  • And then Twilio for the communication.

  • And Twilio does a lot of things. I bought a phone number from it and it let me set up

  • an endpoint to forward any SMS to.

  • I can send a text message to my Twilio phone number and it will forward all those messages

  • to my server.

  • So, before I jump into the project, just to set up like a limitation I had to deal with,

  • SMS can only handle 160 characters at a time.

  • If I want to create a browser, I'm going to have to transmit this data less than one Tweet

  • at a time.

  • So, for more context on the issue.

  • Like this Google web page looks very, very small.

  • It's just a text box and the logo and a button.

  • But the thing is, if you actually look at the page source, not including any CSS, not

  • including the images or any sources being loaded in, this web page is a quarter of a

  • million characters long.

  • If you were to transmit this entire thing, it would take 1,300 SMS, not including the ones dropped along

  • the way.

  • In Twilio fees alone, I would have been paying $10 to transmit this Google page.

  • It defeats the purpose of being a cheap solution.

  • So, we have to do a lot on this page to get it to work.

  • But if we were to imagine what this Google page would look like building it up from scratch,

  • we wouldn't imagine all the CSS to make it load.

  • We would imagine this little bit of HTML that sets up the form and the text box, which is

  • what we need and takes one SMS.

  • This is how we envision a lot of the web pages to make this project work.

  • So, we're going to walk through the life cycle of a request and that's starting on the Android

  • side in the app.

  • Right off the bat we get into this huge limitation with SMS because the URL spec says that

  • a URL can be 2,000 characters long.

  • So, that could take up 13 SMS which is a lot more than we want to deal with.
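
The arithmetic behind that figure can be sketched in a few lines. This is only an illustration: the function name `smsCount` is made up here, and it assumes every message carries a full 160 characters with no overhead.

```javascript
// Maximum characters in a single SMS, as stated in the talk.
const SMS_LIMIT = 160;

// Number of messages needed to carry `length` characters,
// assuming no per-message header or other overhead.
function smsCount(length, limit = SMS_LIMIT) {
  return Math.ceil(length / limit);
}

console.log(smsCount(2000)); // 13
```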

  • The first thing on the app side, and the app looks like this, a text box and a go button.

  • Very simple browser.

  • We're going to start off by chopping off everything that we don't really need.

  • So, the browser is going to be a very text based browser.

  • We're not going to allow any kind of cool single page applications.

  • We can chop off anything that has a pound symbol, page, whatever.

  • We don't want any tracking IDs or query parameters.

  • Everything in black after the URL there we can get rid of.

  • And same with the HTTPS, www, at the start of the URL because it's assumed that all websites

  • have that anyway.

  • So, the part in yellow is what we're going to be sending over as an SMS.

  • And that's gonna look something like this.
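
A rough sketch of that trimming step follows. The real app does this in Java on Android; JavaScript is used here only to keep the examples in one language, and the helper name `shortenForSms` and the sample URL are assumptions.

```javascript
// Hypothetical helper: reduce a URL to the part worth sending over SMS.
function shortenForSms(rawUrl) {
  return rawUrl
    .replace(/^https?:\/\//i, "") // drop the scheme; the server adds https:// back
    .replace(/^www\./i, "")       // drop the www. prefix
    .replace(/[?#].*$/, "")       // drop query parameters and anything after a pound symbol
    .replace(/\/+$/, "");         // drop trailing slashes
}

console.log(
  shortenForSms("https://www.example.com/articles/sms-browser?utm_source=newsletter#comments")
); // example.com/articles/sms-browser
```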

  • I'm not going to cover the Android side of things too much.

  • But make sure we have send, receive and write permissions on Android and use their simple

  • SmsManager API which lets you just specify the destination phone number which in this

  • case is our Twilio one.

  • Specify the text that you want to send and then it just gets sent off.

  • So, then Twilio picks up the message and converts it into a format that the server will read.

  • It comes with a bunch of metadata.

  • We're only going to care about the body, who it's to and from so we know who to send this

  • message back to.
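
As a sketch of that step: Twilio delivers incoming messages to the endpoint as a form-encoded request whose fields include `From`, `To` and `Body`. The helper name and the sample payload below are made up for illustration, and a real server would read the payload from the HTTP request body.

```javascript
// Pick out the three fields the server cares about from a form-encoded payload.
function parseIncomingSms(formBody) {
  const params = new URLSearchParams(formBody);
  return {
    from: params.get("From"), // who to send the reply to
    to: params.get("To"),     // the Twilio number that received the SMS
    body: params.get("Body"), // the shortened URL typed into the app
  };
}

// Hypothetical payload; real requests carry more metadata than this.
const sample =
  "MessageSid=SM123&From=%2B15550001111&To=%2B15552223333&Body=example.com%2Fnews";
console.log(parseIncomingSms(sample));
```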

  • So, this message gets sent over to the server.

  • Which we're gonna look at next.

  • So, a lot of us are probably working in React or some kind of like componentizing library

  • or framework all day.

  • So, we kind of forget how big our HTML ends up really being because we're only dealing

  • with these little tiny components at once.

  • If you're like me and hit view page source accidentally instead of inspect element all the time, you end up

  • with a massive wall of text.

  • And how on Earth are web pages this big?

  • This is what the Google page source looks like.

  • This is what we're going to have to deal with and parse before sending it back over for

  • a text.

  • But there are a lot of things that we can remove from this off the bat.

  • We don't care about comments, we don't care about header data.

  • We don't care about CSS or any of that stuff.

  • So, there's a lot of stuff we can take off pretty easily to get this to work.

  • On the server side, what we're gonna do is start by grabbing the URL that Twilio sent

  • us, making a request to that URL by prepending the HTTPS to it and then use a library called

  • Cheerio which is jQuery for the server side pretty much.

  • And use that to get the body off of the HTML because jQuery makes that quite easy.

  • And then once we have this body, we're going to start to remove a bunch of the HTML.

  • For that, I used a library, sanitize-html.

  • So, line four that I highlighted lets you specify specific tags in the HTML that you

  • want to allow.

  • So, here we're only going to allow anchor tags, inputs and forms.

  • Those are the only elements in my opinion that really provide any kind of value to the

  • user in a text based browser.

  • Lines 7 to 10, highlighted, show the specific attributes on those tags that we are gonna

  • allow.

  • So, we're not gonna allow class name because we don't have CSS anyway.

  • We're not gonna allow image tags because we can't really load images over SMS.

  • And we're going to start loading these things.

  • The last bit that's highlighted is the exclusive filter.

  • So, sanitize HTML lets us specify a function of the tags and attributes to decide whether

  • we want to show certain things.

  • So, one of the examples I have there is we're going to get rid of all of the hidden inputs

  • because the user can't see those anyways so it's not going to provide any value.

  • And we're gonna get rid of policy URLs and terms and conditions.

  • No one is going to click on those anyway so it's kind of wasted space.
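
The slide itself is not in the transcript, so the following is only a guess at what such a configuration might look like. `allowedTags`, `allowedAttributes` and `exclusiveFilter` are sanitize-html options. The attribute lists and the link pattern are assumptions.

```javascript
// Options in the shape sanitize-html accepts; the specific lists are assumptions.
const sanitizeOptions = {
  // Only anchors, inputs and forms survive.
  allowedTags: ["a", "input", "form"],
  // No class names or styling attributes, only what a text browser can use.
  allowedAttributes: {
    a: ["href"],
    input: ["type", "name", "value"],
    form: ["action", "method"],
  },
  // Return true to drop an element entirely.
  exclusiveFilter(frame) {
    // Hidden inputs are invisible to the user.
    if (frame.tag === "input" && frame.attribs.type === "hidden") return true;
    // Policy and terms links are unlikely to be clicked (assumed pattern).
    if (frame.tag === "a" && /privacy|policy|terms/i.test(frame.attribs.href || "")) return true;
    return false;
  },
};

// With the library installed this would be used as:
//   const sanitizeHtml = require("sanitize-html");
//   const clean = sanitizeHtml(bodyHtml, sanitizeOptions);
```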

  • Now we have essentially a whole bunch of text and a couple of tags left in the HTML.

  • We're going to start to compress this text.

  • And we could use something like Gzip.

  • But that's not fun.

  • So, we're gonna forget about any kind of real compression.

  • So, in the English language there are a lot of words that we use super often like "the" and

  • "and".

  • And not a lot of single letters are words, just I and A. So any common English word can be

  • mapped to a single letter that's not a word: every "the" becomes T, "and" becomes

  • ampersand or whatever letter.

  • And on Android side, doing the decompression, we know if that letter is there on its own,

  • it means "the" because it's not a word otherwise.

  • And to do that, it's very, very simple.

  • We have to set up a dictionary mapping these words to their shorter versions and then go

  • through the text and do a replace all.
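
A minimal sketch of that dictionary approach, with a made-up word list. It only swaps exact, lowercase, whitespace-delimited tokens, so words attached to punctuation are left alone, and it assumes the original text never contains one of the codes standing on its own.

```javascript
// Hypothetical mapping from common words to single characters that are not words.
const WORD_TO_CODE = new Map([
  ["the", "t"],
  ["and", "&"],
  ["that", "h"],
  ["with", "w"],
]);
// The reverse table, used on the phone to expand the codes again.
const CODE_TO_WORD = new Map([...WORD_TO_CODE].map(([word, code]) => [code, word]));

// Replace every whitespace-delimited token that has an entry in `table`.
function swapTokens(text, table) {
  return text
    .split(/(\s+)/) // the capture group keeps the whitespace for the round trip
    .map((token) => (table.has(token) ? table.get(token) : token))
    .join("");
}

const compressWords = (text) => swapTokens(text, WORD_TO_CODE);
const decompressWords = (text) => swapTokens(text, CODE_TO_WORD);

console.log(compressWords("the cat and the dog")); // t cat & t dog
```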

  • Another way of compressing the text is through thesaurus APIs.

  • So, if we're visiting a website like Wikipedia, there are going to be a lot of big words that

  • don't need to be that big.

  • So, what we can do is find those very large words, use a thesaurus API and see if there's

  • a matching synonym that's much shorter.

  • "Penitentiary" is many letters and "jail" is a four-letter word. We don't care about the

  • exact word, so we do a replacement.

  • This is a 66% compression which is quite a good compression rate.
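
To make the number concrete, here is a small sketch in which a local lookup table stands in for the thesaurus API. The table contents and function names are made up. Going from 12 letters to 4 saves two thirds of the characters.

```javascript
// Stand-in for a thesaurus API response: long word -> candidate synonyms (hypothetical data).
const SYNONYMS = new Map([["penitentiary", ["prison", "jail"]]]);

// Return the shortest option among the word and its known synonyms.
function shortestSynonym(word) {
  const candidates = [word, ...(SYNONYMS.get(word) || [])];
  return candidates.reduce((best, next) => (next.length < best.length ? next : best));
}

// Fraction of characters saved by a replacement.
function savings(original, replacement) {
  return 1 - replacement.length / original.length;
}

const shorter = shortestSynonym("penitentiary");
console.log(shorter); // jail
console.log(Math.floor(savings("penitentiary", shorter) * 100)); // 66
```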

  • The last way we're going to compress the text in our HTML is by replacing links.

  • So, when you're using a website on your phone, you don't care what a link actually is.

  • You just care that it takes you to where it's supposed to take you.

  • The links can be really, really long.

  • Like up to 2,000 characters.

  • So, instead of sending over these links that no one is actually going to be reading, we're

  • gonna replace them with really short, random strings.

  • So, when a user clicks on a link on the app side, that short link is what's gonna be sent

  • back to the server.

  • The server's gonna know that short link means that long link and it's gonna fetch the correct

  • data to send back to us.

  • And that's gonna look like this.

  • I'm using Redis to store the pairings of short to long URLs because we don't need anything

  • super persistent.

  • Most links are not going to be clicked on anyway and the web page is probably going

  • to be gone in five minutes.

  • What we're doing here is there's a function that takes the phone number from the user

  • and the actual URL that is in the web page and it's going to store it in Redis where

  • the key is the phone number and the short URL so it's really easy to retrieve when it's

  • sent back.

  • And then the value of that is the full URL.
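
The slide code is not in the transcript, so this sketch uses an in-memory `Map` in place of Redis to show the key layout described above. The function names, the `phone:code` key format and the four-character code length are assumptions.

```javascript
// In-memory stand-in for Redis; a real server would use a Redis client instead.
const linkStore = new Map();

// Hypothetical short-code generator: a few random base-36 characters.
function randomCode(length = 4) {
  let code = "";
  while (code.length < length) {
    code += Math.random().toString(36).slice(2);
  }
  return code.slice(0, length);
}

// Store the long URL under "<phone>:<short code>" and return the short code.
function shortenLink(phoneNumber, longUrl) {
  let code = randomCode();
  // Retry on the rare collision so one user's links never overwrite each other.
  while (linkStore.has(`${phoneNumber}:${code}`)) code = randomCode();
  linkStore.set(`${phoneNumber}:${code}`, longUrl);
  return code;
}

// Look up the long URL when the user clicks a short link.
function resolveLink(phoneNumber, code) {
  return linkStore.get(`${phoneNumber}:${code}`);
}
```

Since the pairings do not need to be persistent, a Redis version could also set an expiry on each key so that old links clean themselves up.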

  • Now, the last bit of compression we're going to do is on the HTML itself.

  • So, we've compressed all the English text, we've compressed the tags and attributes.

  • But we're still going to have a lot of large tags like this.

  • And things like input or type and name and value and all those tags that we do allow

  • are going to show up pretty often.

  • So, we can remove those and replace them.

  • A nice thing about the SMS character set is that it supports all of the

  • English letters and numbers and things that you see on an English keyboard plus the

  • whole Greek alphabet.

  • So, what we've done here since all the individual letters are already used by our English compression,

  • the text compression, we're going to start mapping specific HTML tags and combinations

  • of symbols to these Greek letters.

  • I tried to map it by color here, the open bracket input matches the first character,

  • type equals open quotation mark is the second character.

  • And so on.

  • This brings it down from 44 characters to 12 characters which is going to provide a

  • significant compression.
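
The actual mapping from the slide is not in the transcript, so the table below is invented for illustration. It uses Greek capitals that are part of the basic GSM 7-bit SMS alphabet and assumes the page text itself contains no Greek letters.

```javascript
// Hypothetical mapping from frequent HTML fragments to Greek capitals,
// listed longest fragment first.
const FRAGMENT_TO_GREEK = [
  ['<a href="', "Ω"],
  [' value="', "Λ"],
  [' type="', "Φ"],
  [' name="', "Γ"],
  ["</form>", "Θ"],
  ["<input", "Δ"],
  ["<form", "Σ"],
  ["</a>", "Π"],
];

// Server side: swap each fragment for its letter.
function compressHtml(html) {
  return FRAGMENT_TO_GREEK.reduce(
    (out, [fragment, letter]) => out.split(fragment).join(letter),
    html
  );
}

// Phone side: swap each letter back for its fragment.
function decompressHtml(text) {
  return FRAGMENT_TO_GREEK.reduce(
    (out, [fragment, letter]) => out.split(letter).join(fragment),
    text
  );
}

const tag = '<input type="text" name="q">';
console.log(compressHtml(tag)); // ΔΦtext"Γq">
console.log(tag.length, compressHtml(tag).length); // 28 11
```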

  • So, now the HTML is ready to be sent out.

  • But an annoying thing about SMS is that there's no guarantee of delivery and there's no guarantee

  • that it's gonna be delivered in the right order either.

  • So, you might end up with a situation where we've sent out six SMS, but only four of them

  • get there and they're all out of order.

  • We're not going to really care about the part where the SMS are dropped in this project

  • because it's just a small project for me to access the Internet for fun.

  • And if we did worry about that, we would have to build out an entire packet delivery network

  • where you have to figure out which messages were dropped, how to recover from that and

  • that's just a little bit more effort.

  • But we are going to worry about this out of order problem.

  • And to solve that, the HTML that we have is going to be divided up into the 160 character

  • limit.

  • And we're going to prepend some metadata to the start of each message that shows the total

  • number of messages in this web page and then the index where that SMS falls so that when

  • we put the HTML back together it's in the correct order.
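
One way this could look, as a sketch only: the `index/total|` header format is an assumption, since the talk does not show the real one, and the fixed eight-character header budget limits a page to 999 messages. In the project the reassembly happens in Java on the phone, but both halves are shown in JavaScript here.

```javascript
const MAX_SMS_LENGTH = 160;
const HEADER_BUDGET = 8; // room for a header such as "999/999|" (assumed format)

// Split text into messages of the form "<index>/<total>|<payload>".
function toMessages(text) {
  const payloadSize = MAX_SMS_LENGTH - HEADER_BUDGET;
  const total = Math.max(1, Math.ceil(text.length / payloadSize));
  const messages = [];
  for (let i = 0; i < total; i++) {
    const payload = text.slice(i * payloadSize, (i + 1) * payloadSize);
    messages.push(`${i + 1}/${total}|${payload}`);
  }
  return messages;
}

// Rebuild the text from messages received in any order.
// Returns null while some messages are still missing.
function fromMessages(received) {
  const parts = new Map();
  let total = 0;
  for (const message of received) {
    const split = message.indexOf("|"); // the first "|" always ends the header
    const [index, count] = message.slice(0, split).split("/").map(Number);
    total = count;
    parts.set(index, message.slice(split + 1));
  }
  if (total === 0 || parts.size < total) return null;
  let text = "";
  for (let i = 1; i <= total; i++) text += parts.get(i);
  return text;
}
```

Returning `null` for an incomplete set matches the decision above to ignore dropped messages: the page simply never renders if a part goes missing.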

  • And to send out those messages, we're just going to use Twilio's library.

  • Very simple, Twilio, send message.

  • And it goes off and that's all I have to do.

  • Now the messages are sent and we're ready to start getting them again on the Android

  • side.

  • So, Android has a thing called a broadcast receiver which is something that listens out

  • for certain signals that are sent within your phone.

  • So, here we have set up a broadcast receiver that listens out for messages coming in specifically

  • from the number that we own on Twilio.

  • So, what this is doing is it's just listening for that message to come in.

  • And then it's grabbing the text from that message and sending it over to an activity

  • which is Android's way of saying a new view.

  • And in that new view, the first thing we're going to have to do is reverse that Greek

  • letter conversion that we did to get the proper HTML back.

  • We're going to reverse the shortened English words.

  • We're going to add some spaces between closing and opening tags so there's a little bit of