Reverse Engineering Android Apps
Join the dark side :)
Reverse engineering in general is a tricky business and sometimes not very orthodox. So, why bother to write this article?
Well, sometimes reverse engineering is also for something good. It started when my wife dusted off her watch. We had a huge unpleasant surprise when we found that the companion app is not available anymore on Google Play! The watch is completely useless without the companion app, as you can’t even set the time on it… Because I hate to throw away a perfectly working watch I decided to create an app for it myself.
My first instinct was to find an older phone with the app still alive and to use a BLE sniffer to reverse engineer the BLE protocol. But I didn’t find the application installed on any old phones. I found the APK online, but it cannot be used anymore as it relied on some online services which are offline now…
The next obvious step was to decompile the application to get the communication protocol and also the algorithms behind the sleep & activity tracking. This is how our story begins ;-).
Long story short
Decompiling Android apps is not that complicated. It takes time (a LOT of time), but it’s fun and rewarding.
Just don’t believe all those movies where someone decompiles an app and understands all the logic behind it in seconds :).
Tools I used to decompile:
- tintinweb.vscode-decompiler, a VSCode extension (https://marketplace.visualstudio.com/items?itemName=tintinweb.vscode-decompiler). With this extension it is quite easy to decompile an APK: just right-click on it and the magic happens in a few seconds. The only problem I found is that it doesn’t do any de-obfuscation (or at least I didn’t set it up correctly).
- Dex to Java decompiler (https://github.com/skylot/jadx). I found it better than vscode-decompiler as it has semi de-obfuscation. You’ll never get the original names, but you get unique names instead of a, b, c, etc.
- NSA’s Ghidra (https://github.com/NationalSecurityAgency/ghidra). Apart from a lot of Java code, this application keeps its core logic (the sleep & activity algorithms) in native C++. I used Ghidra to decompile the native (C++) parts. It has a Java decompiler as well, but it is not as good as jadx.
Short story long
I chose an older version of the app, as the latest one had support for too many watches which I didn’t care about (at least not now). Android APKs support (partial) obfuscation, which makes decompilation less than straightforward; in some cases it’s actually pretty complicated. What does obfuscation do? It renames all packages to a, b, c, etc., then all classes in each package to a, b, c, etc., then all methods of each class to a, b, c, etc., then all fields to (yes, you guessed right) a, b, c, etc. This means you’ll end up with loads of classes, methods and fields that are all named a.
Sometimes it is easy to guess what the fields are, e.g.:
    jSONObject.put("result", (int) this.c.a);
    jSONObject.put("fileHandle", this.c.c);
    jSONObject.put("newSizeWritten", this.c.d);
But in some cases you need to do a lot of detective work.
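The easy case above can be partly automated: wherever an obfuscated field sits next to a string literal, the literal is a good hint for the field's original name. The following is only a rough sketch of that idea, not something taken from the tools mentioned here. The function name guessFieldNames and the regular expression are my own assumptions, and the pattern only covers put("key", this.x.y) calls shaped like the ones in the snippet.

```cpp
#include <cassert>
#include <map>
#include <regex>
#include <string>

// Collects "obfuscated field path -> JSON key" hints from decompiled Java source.
// Matches calls shaped like put("key", this.c.a) or put("key", (int) this.c.a).
// Hypothetical helper: name and pattern are illustrative only.
std::map<std::string, std::string> guessFieldNames(const std::string &source)
{
    static const std::regex putCall(
        R"re(put\("([^"]+)",\s*(?:\(\w+\)\s*)?this\.(\w+(?:\.\w+)*)\))re");

    std::map<std::string, std::string> hints;
    for (std::sregex_iterator it(source.begin(), source.end(), putCall), end;
         it != end; ++it) {
        // Group 1 is the JSON key, group 2 the field path after "this.".
        hints[(*it)[2].str()] = (*it)[1].str();
    }
    return hints;
}
```

Feeding it the three put() calls from above yields the hints c.a -> result, c.c -> fileHandle and c.d -> newSizeWritten, which you can then apply as renames in your decompiler of choice.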
As I mentioned in the beginning, I used three tools: the vscode-decompiler extension, jadx and Ghidra.
I started with vscode-decompiler, hoping that GitHub Copilot would help me in the process. It turned out to be completely useless for such tasks. When I imported the decompiled code into Android Studio, 90% of the classes had problems due to obfuscation. Because there are dozens of classes with the same name (i.e. “a”, “b”), imagine how many conflicts you get.
Next, I used jadx to decompile the application, as it supports semi de-obfuscation. This time I could import the project into Android Studio: all the obfuscated classes now have unique names (e.g. C1189f), which makes Android Studio happier.
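To see why unique names remove the conflicts, here is a small sketch of such a renaming pass. It is loosely modeled on names like C1189f (a prefix, a number, then the obfuscated name); the exact scheme jadx uses is an assumption on my side, and the helpers simpleName and uniqueAliases are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Returns the simple class name: "a.b.c" -> "c".
std::string simpleName(const std::string &qualified)
{
    const std::size_t dot = qualified.rfind('.');
    return dot == std::string::npos ? qualified : qualified.substr(dot + 1);
}

// Gives every class a unique alias: 'C' + zero-padded index + obfuscated name.
// The naming scheme is an assumption made for illustration.
std::vector<std::string> uniqueAliases(const std::vector<std::string> &classes)
{
    std::vector<std::string> aliases;
    aliases.reserve(classes.size());
    for (std::size_t i = 0; i < classes.size(); ++i) {
        char index[32];
        std::snprintf(index, sizeof index, "%04zu", i);
        aliases.push_back("C" + std::string(index) + simpleName(classes[i]));
    }
    return aliases;
}
```

With the input a.a, a.b, b.a the two classes called a end up as C0000a and C0002a, so the simple names no longer clash even though the packages are still obfuscated.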
Just to be crystal clear, you cannot recompile the application and run it, unless the application is simple enough! After a few hours of guessing the names of the classes and their fields, I finally found what I was looking for: the BLE protocol! To my surprise, it has a lot of commands. I quickly cleaned up the few BLE commands that I was interested in:
- start/stop the sync animation on the watch
- set/get the time
I used bluetoothctl to quickly try the start/stop animation BLE commands, and they worked perfectly.
The application has all the sleep & activity logic written in C++, so I also had to decompile the native part. For this job, I used Ghidra with the https://github.com/extremecoders-re/ghidra-jni extension. Ghidra is a fantastic tool. I tried a few more tools, radare2/rizin and Binary Ninja (the free online version), but, personally, I found Ghidra the richest in features. Compiled C++ code is obfuscated “by design” due to the various optimizations done by C/C++ compilers, and it’s far, FAR harder to decompile than Java. A long time ago I did a lot of binary decompilation, and most of the time when I tried to generate any C/C++ code from a binary, the result was pure garbage. I was amazed at how good Ghidra’s C/C++ decompilation is.
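A handy entry point into the native library is the set of JNI exports, as their names encode the Java class and method they belong to. As far as I know from the JNI specification, such names start with Java_, use _ as separator and escape a literal underscore as _1 (with _2 and _3 standing for ; and [ in signatures). The decoder below is a minimal sketch under those assumptions. The function name decodeJniName is made up, and the _0xxxx Unicode escape and the __ signature suffix of overloaded methods are not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Decodes a JNI export name into a dotted "package.Class.method" string.
// Handles the _1, _2 and _3 escapes only; hypothetical helper for illustration.
std::string decodeJniName(const std::string &symbol)
{
    const std::string prefix = "Java_";
    if (symbol.compare(0, prefix.size(), prefix) != 0)
        return symbol; // not a JNI export, return it unchanged

    std::string out;
    for (std::size_t i = prefix.size(); i < symbol.size(); ++i) {
        if (symbol[i] != '_') {
            out += symbol[i];
            continue;
        }
        const char next = i + 1 < symbol.size() ? symbol[i + 1] : '\0';
        if (next == '1')      { out += '_'; ++i; } // escaped underscore
        else if (next == '2') { out += ';'; ++i; } // ';' in a signature
        else if (next == '3') { out += '['; ++i; } // '[' in a signature
        else                  { out += '.'; }      // package/class separator
    }
    return out;
}
```

For example, Java_package_class_name_ActivityVect_1add decodes to package.class.name.ActivityVect_add, which already tells you that the native function adds something to a vector of activities.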
Just to be clear, it requires a *LOT* of time to clean the code, to define all the structures, to do all the connections between them and to un-flatten all the STL stuff (here, some STL internals knowledge is needed), but the experience was better than I ever dreamt. Even if we can guess what it does from the function name, let’s take a very simple example to see what the C++ decompilation looks like and how verbose STL can be:
undefined Java_package_class_name_ActivityVect_1add
          (JNIEnv *env,jclass thiz,jlong a0,jobject a1,jlong a2,jobject a3)
{
  undefined uVar1;
  undefined8 *puVar2;
  undefined8 uVar3;
  undefined8 *puVar4;

  if (a2 == 0) {
    uVar1 = FUN_0015d3ac((long *)env,7,
                         "std::vector< Activity >::value_type const & reference is null");
    return uVar1;
  }
  puVar4 = *(undefined8 **)(a0 + 8);
  puVar2 = *(undefined8 **)(a0 + 0x10);
  if (puVar4 != puVar2) {
    if (puVar4 != (undefined8 *)0x0) {
      uVar3 = *(undefined8 *)(a2 + 8);
      *puVar4 = *(undefined8 *)a2;
      puVar4[1] = uVar3;
      uVar3 = *(undefined8 *)(a2 + 0x18);
      puVar4[2] = *(undefined8 *)(a2 + 0x10);
      puVar4[3] = uVar3;
      uVar3 = *(undefined8 *)(a2 + 0x28);
      puVar4[4] = *(undefined8 *)(a2 + 0x20);
      puVar4[5] = uVar3;
      uVar3 = *(undefined8 *)(a2 + 0x38);
      puVar4[6] = *(undefined8 *)(a2 + 0x30);
      puVar4[7] = uVar3;
      puVar2 = *(undefined8 **)(a2 + 0x40);
      puVar4[8] = puVar2;
    }
    *(undefined8 **)(a0 + 8) = puVar4 + 9;
    return (char)puVar2;
  }
  std::vector<>::_M_emplace_back_aux<>((vector<> *)a0,(Activity *)a2);
  return (char)a0;
}
All right, the decompiled code is pretty cryptic and it doesn’t tell us too much. Now, let’s see if we can make it better:
- first we need to define the Activity structure. I was lucky as I knew all the structure fields because they were set by the Java code via JNI in other places ;-).
- next is to define the std::vector structure. Every single std::vector defines 3 fields:
    T *__begin_;
    T *__end_;
    T *__end_cap_;
Yes, that’s all a std::vector needs to do all the magic: iterate, insert, push, pop, erase, etc.
- last but not least use them in our function:
undefined Java_package_class_name_ActivityVect_1add
          (JNIEnv *env,jclass thiz,vector<> *vec_ptr,jobject a1,Activity *value,jobject a3)
{
  undefined uVar1;
  Activity *end_cap;
  uint64_t uVar2;
  Activity *end;

  if (value == (Activity *)0x0) {
    uVar1 = FUN_0015d3ac((long *)env,7,
                         "std::vector< Activity >::value_type const & reference is null");
    return uVar1;
  }
  end = vec_ptr->__end_;
  end_cap = vec_ptr->__end_cap_;
  if (end != end_cap) {
    if (end != (Activity *)0x0) {
      uVar2 = value->endTime;
      end->startTime = value->startTime;
      end->endTime = uVar2;
      uVar2 = value->bipedalCount;
      end->point = value->point;
      end->bipedalCount = uVar2;
      uVar2 = value->maxVariance;
      end->variance = value->variance;
      end->maxVariance = uVar2;
      uVar2 = value->doubleTapCount;
      end->trippleTapCount = value->trippleTapCount;
      end->doubleTapCount = uVar2;
      end_cap = *(Activity **)&value->tag;
      *(Activity **)&end->tag = end_cap;
    }
    vec_ptr->__end_ = end + 1;
    return (char)end_cap;
  }
  std::vector<>::_M_emplace_back_aux<>(vec_ptr,value);
  return (char)vec_ptr;
}
Okay, so now the code is much cleaner and we figure out exactly what this function does. This means we can write it in a single line of code:
vec.push_back(val);
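To make the link between the decompiled function and push_back more concrete, here is a minimal sketch of the same fast path in plain C++. It is not the standard library's implementation. The Activity field types are assumed to be 64-bit integers (the decompilation only shows nine 8-byte slots), the ActivityVec struct and the helpers pushBackFast, elementCount and capacityOf are made up for illustration, and the reallocation path (the _M_emplace_back_aux call) is left out. The member names drop the leading double underscore, as such identifiers are reserved for the implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed layout: nine 8-byte fields at offsets 0x00..0x40, as read by the
// decompiled code. The real field types are unknown and may differ.
struct Activity {
    std::uint64_t startTime;       // 0x00
    std::uint64_t endTime;         // 0x08
    std::uint64_t point;           // 0x10
    std::uint64_t bipedalCount;    // 0x18
    std::uint64_t variance;        // 0x20
    std::uint64_t maxVariance;     // 0x28
    std::uint64_t trippleTapCount; // 0x30 (spelling kept from the article)
    std::uint64_t doubleTapCount;  // 0x38
    std::uint64_t tag;             // 0x40
};
static_assert(sizeof(Activity) == 72, "'puVar4 + 9' advances by nine 8-byte words");
static_assert(offsetof(Activity, tag) == 0x40, "tag is the last word copied");

// The three pointers from the article, for a vector of Activity.
struct ActivityVec {
    Activity *begin_ = nullptr;   // first element
    Activity *end_ = nullptr;     // one past the last element
    Activity *end_cap_ = nullptr; // one past the allocated storage
};

std::size_t elementCount(const ActivityVec &vec)
{
    return static_cast<std::size_t>(vec.end_ - vec.begin_);
}

std::size_t capacityOf(const ActivityVec &vec)
{
    return static_cast<std::size_t>(vec.end_cap_ - vec.begin_);
}

// Fast path of push_back: with spare capacity, copy the element to *end_ and
// advance end_ by one element. Returns false when the storage is full; the
// real code reallocates at that point, which this sketch does not do.
bool pushBackFast(ActivityVec &vec, const Activity &value)
{
    if (vec.end_ == vec.end_cap_)
        return false;
    *vec.end_ = value; // the nine word copies in the decompiled code
    ++vec.end_;        // 'end + 1', i.e. 72 bytes further
    return true;
}
```

The extra null check on end in the decompiled code is most likely an artifact of how the compiler constructs the element in place, so the sketch does not reproduce it.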
Of course, you’ll find more, MUCH more complicated cases where you’ll spend a lot of time to figure out what’s going on.
I really hope that some day AI will be intelligent enough to do this job for us. Yes, I’m one of those people who are not afraid to embrace new technologies :).
Side note, even though ghidra has an excellent C/C++ decompilation, good ASM knowledge will help a lot where ghidra fails to decompile to C/C++.
After I had enough info about the BLE protocol, I began to write a Qt application to use it. I found the BLE support in Qt 6.5.1 quite good (at least on Android & Linux desktop), as I could use quite a few BLE commands painlessly.
The application is still in its early stages and it will require more time, pain and sorrow to bring it to the same level as the original application, but it’s a start ;-).
Thank you for your time.