Thursday, 29 August 2019

Hello, Dreamboat. Shall we chat?

The story goes like this: I got a gig to build a bot for the ArtStation.com website. The requirement was an automated way to send messages to ArtStation users. So I used AppleScript to drive Safari, and JavaScript to bridge between AppleScript and the web page. The bot, or rather the script, reads a list of user profile links from a local file and loads them in Safari. On the first launch, if the sender is not signed in, the bot signs in with the sender credentials loaded from a creds file. Then the bot invokes a couple of JavaScript bridging methods to simulate clicks on the buttons, add text to the message field, and trigger a change event so that AngularJS recomputes the constraints and enables the submit button. It then taps the submit button, which sends the message to the user, and finally closes the tab. Once all tasks are done, it closes the browser.

Everything is cool, right? Not quite, because now the requirements change: the bot needs some intelligent behaviour, and there won't be any list provided. The bot has to figure out which message to send to which category of user, and it has to collect all the users on the website by itself. So basically the bot is a crawler plus an intelligent messenger. Okay, so let's add some intelligent behaviour and spawn a couple of crawlers. When it comes to AI, true randomness marks the height of intelligence ;) So with that in mind, I went with a full-fledged re-architecture.

The app is now a Cocoa app that uses a web view for the messaging part. This gives more fine-grained control over the web page loading events and such. The JavaScript communicates with the native code through WebKit message handlers. There isn't much UI in the app, as the main focus was on functionality. I added some screens to view the details of the crawler and the messenger, and to configure messages and sender credentials, which are persisted to the database.

Unfortunately, the website provides no developer API. Life would have been much easier otherwise. I did some debugging of the JavaScript the site loads and figured out how their API service works. Just for kicks, I wrote the crawler and frontier services in Objective-C. The crawler first gets the anti-CSRF token so that the request can get through the CloudFront security validation. Without the CSRF token, we get a captcha, which is hard for my bot to solve ;) Then the bot calls the user v2 API with the same params the website uses to get the list of users, and the API returns the data as JSON. The JSON data reflects their model layer, presumably that of a NoSQL DB.

Now that we have the user list by category, we need to persist the data. There are probably close to a million users, so I needed a DB that scales well. So I went with FoundationDB with the Document Layer. As a matter of fact, FoundationDB powers iCloud, and its distributed, fault-tolerant, scalable architecture is very promising. All of the DB setup went really well.

Now I needed a MongoDB driver to talk to the Document Layer. So I used the official MongoSwift library, but Swift Package Manager refused to work because it sensed the presence of Objective-C code. After wrestling with SPM, Xcode and MongoSwift, I just wrote the FoundationDBSevice, which is the persistence layer, in Objective-C again so that I could call directly into the Mongo C driver to work with the DB. Less pain. If I had used only Swift, working with SPM would have been a breeze and I could have used MongoSwift readily.

Nevertheless, the bot now crawls the website, saving users by category to the local FDB. To avoid DDoSing their website, I used the GKGaussianDistribution that comes with GameplayKit to generate a random number from a specified mean and standard deviation, and I use this value along with the current time to schedule the next crawl. The same logic is used for the messenger as well, but with a different set of mean and SD values. The bot saves the state of each crawl so that the next time it starts, it can crawl (fetch) users from where it left off.

Users are messaged only once per category. If a user belongs to multiple categories, they get a different message for each one, relevant to that category. The sender details can be added from the settings and are persisted in the DB, except for the password, which is stored in the macOS keychain.
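The Gaussian scheduling idea above can be sketched in a few lines. This is only an illustration in Python, not the app's code: the app uses GameplayKit's GKGaussianDistribution, while here the standard library's `random.gauss` stands in for it. The function names, the mean and SD values, and the clamp to three standard deviations are all my assumptions for the example.

```python
import random
from datetime import datetime, timedelta

# Hypothetical mean / standard deviation pairs, in seconds.
CRAWL_MEAN, CRAWL_SD = 30.0, 8.0
MESSAGE_MEAN, MESSAGE_SD = 120.0, 30.0


def next_delay_seconds(mean, std_dev, rng, minimum=1.0):
    """Draw a delay from a normal distribution around `mean`.

    The draw is clamped to mean +/- 3 standard deviations (and never
    below `minimum`) so an unlucky sample cannot produce a negative or
    absurdly long wait. The clamp is an assumption of this sketch.
    """
    low = max(minimum, mean - 3 * std_dev)
    high = mean + 3 * std_dev
    return min(max(rng.gauss(mean, std_dev), low), high)


def schedule_next(now, mean, std_dev, rng):
    """Return the time of the next run: current time plus a random delay."""
    return now + timedelta(seconds=next_delay_seconds(mean, std_dev, rng))


rng = random.Random()
start = datetime(2019, 8, 29, 12, 0, 0)
next_crawl = schedule_next(start, CRAWL_MEAN, CRAWL_SD, rng)
next_message = schedule_next(start, MESSAGE_MEAN, MESSAGE_SD, rng)
print("next crawl at", next_crawl)
print("next message at", next_message)
```

Most delays land near the mean, with the occasional shorter or longer pause, so the traffic looks less like a fixed-interval loop.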

We can set a message template for each category, and the bot interpolates the string before sending the message to each user. Now the bot says Hi to ArtStation users.
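To make the per-category template idea concrete, here is a minimal sketch in Python using the standard library's `string.Template`. The category key, the `$name` placeholder and the message text are made up for the example. The real app stores its templates in the DB and may use a different placeholder syntax.

```python
from string import Template

# Hypothetical templates, keyed by category.
templates = {
    "concept_art": "Hi $name, I came across your concept art and loved it!",
}


def render_message(category, user):
    """Fill the category's template with details of the given user."""
    template = Template(templates[category])
    # safe_substitute leaves unknown placeholders untouched instead of raising.
    return template.safe_substitute(name=user["name"])


print(render_message("concept_art", {"name": "Ada"}))
```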



Check out the source code on GitHub and let me know what you think.
