I was approached in late 2012 and asked about the possibility of creating a personal offline version of the Google Play Store (free apps only). This would essentially involve collecting every application from the Play Store and the related information like title, description and so on.
It’s been over 4 months since the project was completed and I’m finally able to share some details about it.
The project was split into 4 sections:
- Installing the apps to an Android device – to get access to the APK files (installation files)
- Extracting the APKs from the device
- Collecting information about each APK from the Play Store
- An offline front-end to access the applications
Grabbing the APK files
To get the APK files, you have to install the apps onto your Android device via the Play Store. The APK file is downloaded, the app is installed and then the APK file is stored in /data/app, making it easy to grab the file. I’d originally envisaged this being the most difficult part of the task but it turned out to be relatively easy. We automated the Safari browser to go through the Play Store by category and page and install the applications to our devices. We ran multiple instances on virtualized machines. We started with 2 Huawei Blaze phones as a proof of concept, and it quickly became apparent that their low processing power, and having only 2 of them, would hugely delay the project. So we swapped to 6 Galaxy S3 units (Korean model with 2GB RAM), which were much quicker and performed far more reliably.
Extracting the APK files
This was one of the easiest parts of the project. There are many FTPD and SSHD apps available in the Play Store. We installed an FTP server on each phone and created a script that connected to each phone periodically, downloaded the .apk files from /data/app and then deleted them.
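For anyone curious what such a script could look like, here is a minimal sketch using Python's standard ftplib. It is not the script we ran. The host, port, credentials and local directory are placeholders, and it assumes the FTP server app on the phone is allowed to read and delete files in /data/app.

```python
import ftplib
import os
import posixpath


def pull_apks(ftp, remote_dir, local_dir):
    """Download every .apk in remote_dir, then delete it from the phone.

    `ftp` is a logged-in ftplib.FTP connection (or anything with the same
    cwd/nlst/retrbinary/delete methods).
    """
    os.makedirs(local_dir, exist_ok=True)
    ftp.cwd(remote_dir)
    pulled = []
    for entry in ftp.nlst():
        name = posixpath.basename(entry)  # some servers return full paths
        if not name.lower().endswith(".apk"):
            continue
        local_path = os.path.join(local_dir, name)
        with open(local_path, "wb") as fh:
            ftp.retrbinary("RETR " + name, fh.write)
        # Only reached if the download above did not raise.
        ftp.delete(name)
        pulled.append(name)
    return pulled


def pull_from_phone(host, port, user, password, local_dir):
    """Connect to one phone (placeholder credentials) and pull its APKs."""
    ftp = ftplib.FTP()
    ftp.connect(host, port, timeout=30)
    ftp.login(user, password)
    try:
        return pull_apks(ftp, "/data/app", local_dir)
    finally:
        ftp.quit()
```

Calling pull_from_phone for each phone from a scheduled job would give the "periodically" part. Deleting only after a completed download means a dropped connection leaves the file on the phone for the next run.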
Collecting information about each APK
Scraping a lot of information from Google can be difficult without hitting their ‘We’re sorry but it appears your computer is sending automated requests’ warning. Fortunately, we had access to a large number of relatively clean proxies and were able to put in a lot of random delays, since it was always going to take far longer to download the apps to the phones than to collect information about each application.
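The general idea of rotating proxies and sleeping for a random interval between requests can be sketched roughly as below. This is an illustration only, not our scraper: the function names, the delay range and the proxy address format are all assumptions, and parsing the returned pages is left out.

```python
import itertools
import random
import time
import urllib.request


def fetch_via_proxy(url, proxy, timeout=30):
    """Fetch url through one HTTP proxy and return the body as bytes."""
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({"http": proxy, "https": proxy})
    )
    with opener.open(url, timeout=timeout) as resp:
        return resp.read()


def crawl(urls, proxies, min_delay=5.0, max_delay=30.0,
          fetch=fetch_via_proxy, sleep=time.sleep):
    """Yield (url, body) pairs, rotating proxies and pausing randomly.

    The delay bounds are arbitrary example values, in seconds.
    """
    proxy_cycle = itertools.cycle(proxies)
    for url in urls:
        proxy = next(proxy_cycle)
        yield url, fetch(url, proxy)
        # A random pause makes the request pattern less regular.
        sleep(random.uniform(min_delay, max_delay))
```

Passing fetch and sleep as parameters is just a convenience so the loop can be tried out without touching the network.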
The client has asked that we don’t go into much detail about the front-end so all I can say is that the app data is stored in a database and matches up with the APK files.
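Without describing the real front-end, one generic way to match scraped data to APK files is by package name. The sketch below is purely hypothetical: the table layout and function names are invented for illustration, and it assumes the files pulled from /data/app are named after the package with an optional numeric suffix (for example com.example.app-1.apk).

```python
import re
import sqlite3

# Package name, optional "-<number>" suffix, then ".apk".
_APK_NAME = re.compile(r"^(?P<package>.+?)(-\d+)?\.apk$")


def package_from_filename(filename):
    """Return the package name encoded in an APK filename, or None."""
    match = _APK_NAME.match(filename)
    return match.group("package") if match else None


def build_index(conn, metadata, apk_filenames):
    """Store (package, title, description) rows with their matching APK file."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS apps ("
        "package TEXT PRIMARY KEY, title TEXT, description TEXT, apk_file TEXT)"
    )
    files = {package_from_filename(f): f for f in apk_filenames}
    for package, title, description in metadata:
        conn.execute(
            "INSERT OR REPLACE INTO apps VALUES (?, ?, ?, ?)",
            (package, title, description, files.get(package)),
        )
    conn.commit()
```

Apps whose APK has not been pulled yet simply get an empty apk_file column, which makes it easy to see what is still missing.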
Why create an offline copy of the Play Store?
I couldn’t even imagine all the reasons why someone might want to create an offline copy. But I’ve been blocked from downloading apps because I’m not in X country, or because the Play Store thinks the app isn’t compatible with my device. A copy of all those APK files, grabbed from a Google-friendly country with a highly popular and well-supported phone, would be very useful for people in smaller countries and for those who use devices that aren’t marked as supported in the Play Store.
This project turned out to be a lot easier than I’d ever imagined and the costs were fairly minimal. The biggest thing was having patience since there’s just an immense number of applications in the Play Store and more are added every day. Did we get every single last application? No, probably not even close. But did we get every application that anyone would likely want? I think so. Of course this archive will quickly become outdated so constant maintenance is required if you want a fairly updated system.
Overall it was an interesting project that I imagine will only become harder to replicate as the Play Store continues to evolve.
I’ve created a little video that shows the basic technology used in this project. This is actually a recent run recorded as a demonstration for this article and not live footage from the client’s project. The video ends abruptly because we’re not able to show you the front-end.