Friday, September 9, 2011

Hacking my data

This post was spurred on by @NateOsit's comments about QS apps earlier today.  The conversation ended with this tweet (you can see the whole conversation in reverse order by clicking on the time/date of the prior tweet that each was in reply to).

One of my recent acquisitions was an iPad 2.  I purchased it from proceeds of my advance for The CDA Book.  I also got another toy, an iHealth Blood Pressure Monitor (I'm at risk for hypertension), and an app that lets me track my weight.  These are both problems my father suffered from, and in part they contributed to his death, so I'm paying attention to them.  And because, like my father, I also have white coat syndrome, my BP is higher in the office than the rest of the time.  Having the BP monitor and a record to show my doctor has kept me off BP meds thus far.  But it's also made me more aware of my blood pressure.

One of the things that I don't like about the iPad, or most of the apps for it, is how difficult it is to get data out of them.  But then I ran across this article by Adam Crosby.  It explains how to find the files that are on my PC that correspond to files on my iPad.  It also suggests that many of them are easy to hack.  So, I went digging.

If you have an iPad and a Windows PC (like me), you can find the backup folder in a place something like C:\Documents and Settings\YourUserName\Application Data\Apple Computer\MobileSync\Backup\SomeLongHexString

That folder contains a file for every data store on your device.  For me, that folder contained more than 100,000 files, and took about 30 seconds just to list the directory.  Almost all of the files are named using a hexadecimal string.  Trying to find your data in all of this for any given app would seem to be a nightmare, but it really isn't that hard.  It took me about an hour.  Most of that was waiting time.
  1. Close all of your iPad apps, or shut down and restart your iPad.  This step makes sure that only files you want are changed.
  2. Sync your iPad with your PC
  3. Make a copy of your backup folder somewhere else.
  4. Twiddle your thumbs, or catch up on your RSS Feed while the disk churns.
  5. Open the one app whose data files you want to find.
  6. Change some data in it, and close the app.  Know what data you had and what you changed it to.  This could be important later when trying to decode the data.
  7. Resync your iPad.
  8. Make a copy of your backup folder again, in yet another location.
  9. More thumb twiddling.
  10. Using WinDIFF or other directory comparison tool, locate the files that have changed.
  11. Reread War and Peace.  This takes a while.  (Don't really worry about the time; the whole process took me less than an hour, which is less time than it took me to write this post.)
  12. Copy those files (there should only be a few) to another folder.
  13. Do it again!  There's a reason for this, because you are going to modify some of them.
  14. Now, go find plutil.exe on your computer.  I found it in C:\Program Files\Common Files\Apple\Apple Application Support.  This is a utility that lets you unpack binary PLIST files.  PLIST is an XML format for property lists, somewhat like the Windows Registry (I mentioned it briefly in this post).  Binary PLIST is a compact binary encoding of the same data that the plutil application can make readable for you.
  15. Next, download a tool that will allow you to access SQLite databases.  This one worked just fine for me.
  16. Now, dig through the files that you found that changed.
    1. Some of them will likely be in PLIST format.
    2. Others will be in binary PLIST format.  These can be detected by the presence of "bplist" at the start of the file.
    3. Others might be in SQLite format.  These can be detected by the presence of "SQLite format" followed by a version number at the start of the file.
The files in PLIST format you can just decode yourself.
The files in BPLIST format you need to unpack using plutil.  Assuming you've added PLUTIL.EXE to your path, the command to "unpack" is PLUTIL -convert xml1 FILENAME
The files in SQLite format you will need to export to a CSV or other file format.
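
Steps 10 and 16 can also be scripted.  What follows is only a rough sketch in Python, not the way I actually did it: it assumes the two backup copies are flat folders, and the function names (changed_files, sniff_format) are made up for illustration.  It lists the files that are new or different in the second copy, then guesses each file's format from its first bytes, using the markers described above.

```python
import filecmp
import os


def changed_files(before_dir, after_dir):
    """Return names of files in after_dir that are new or differ from before_dir."""
    changed = []
    for name in sorted(os.listdir(after_dir)):
        after_path = os.path.join(after_dir, name)
        if not os.path.isfile(after_path):
            continue  # assumption: the backup folder is flat, so skip subfolders
        before_path = os.path.join(before_dir, name)
        # shallow=False forces a byte-by-byte comparison of the contents
        if not os.path.isfile(before_path) or not filecmp.cmp(
            before_path, after_path, shallow=False
        ):
            changed.append(name)
    return changed


def sniff_format(path):
    """Guess a file's format from its first few bytes."""
    with open(path, "rb") as f:
        head = f.read(256)
    if head.startswith(b"bplist"):
        return "binary plist"
    if head.startswith(b"SQLite format"):
        return "sqlite"
    if b"<!DOCTYPE plist" in head or b"<plist" in head:
        return "xml plist"
    return "unknown"
```

Calling changed_files on the first and second copies and then sniff_format on each result gives a short list of hex-named files along with a hint about which tool to point at each one.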

Once you've found your files, remember those nasty hex strings for them.  They'll be in the same place the next time.  So now you can start dumping data.

For my iHealth, the data file I wanted was in PLIST format.  It looked something like this:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<array>
 ...
<string>34</string>
<string>07-07-2011</string>
<string>08:10</string>
<string>117</string>
<string>79</string>
<string>65</string>
<string>6</string>
<string>1</string>
<string>0</string>
<string>0</string>
...
</array>
</plist>

The format is a big array of strings.  Each measurement has 10 strings.  The first is the measurement number.  The next two are date and time.  The next is systolic, followed by diastolic, then heart rate.  I haven't figured out the next four, but they don't much matter to me yet.  I can create an XSLT to reformat this into an HTML table, which I can then import into a spreadsheet and start graphing.
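
As an alternative to XSLT, the same regrouping can be sketched in Python with the standard plistlib module, which reads both XML and binary PLIST data.  This is only a sketch under assumptions: the function name parse_readings and the dictionary keys are my own labels, and it assumes the array starts on a measurement boundary, which may not hold for the parts of the file elided above.  The date is left as a string because I haven't checked whether it is month-first or day-first.

```python
import plistlib

FIELDS_PER_READING = 10  # each measurement occupies 10 consecutive strings


def parse_readings(plist_bytes):
    """Group the flat array of strings into one dict per measurement."""
    values = plistlib.loads(plist_bytes)
    readings = []
    # Step through the array 10 items at a time, ignoring any partial tail.
    for i in range(0, len(values) - FIELDS_PER_READING + 1, FIELDS_PER_READING):
        chunk = values[i:i + FIELDS_PER_READING]
        readings.append({
            "number": int(chunk[0]),
            "date": chunk[1],        # kept as text; day/month order is unverified
            "time": chunk[2],
            "systolic": int(chunk[3]),
            "diastolic": int(chunk[4]),
            "heart_rate": int(chunk[5]),
            "unknown": chunk[6:],    # the four fields not yet figured out
        })
    return readings
```

The resulting list of dictionaries is easy to write out as an HTML table or a CSV file for a spreadsheet.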

The weight measuring app tracks four numbers in a SQLite database: the userid (it can be used with multiple users), the date (recorded as the number of days since January 1, 2000), the weight, and something they call my "true weight", which appears to be a formula used to generate some sort of weighted moving average.  My weight is clearly recorded in grams (the app displays weight in lbs, stone or kg).
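
Turning those stored values back into something readable is simple arithmetic.  Here is a small sketch in Python; the function names are hypothetical, and it assumes that day 0 is January 1, 2000 itself (the app could just as well count from 1, which would shift every date by a day).

```python
from datetime import date, timedelta

EPOCH = date(2000, 1, 1)         # assumption: stored day 0 is this date
GRAMS_PER_POUND = 453.59237      # exact definition of the avoirdupois pound


def day_number_to_date(days):
    """Convert a stored day count into a calendar date."""
    return EPOCH + timedelta(days=days)


def grams_to_pounds(grams):
    """Convert a stored weight in grams to pounds."""
    return grams / GRAMS_PER_POUND
```

Comparing one converted row with what the app shows for the same day is a quick way to check the day-0 assumption.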

So, now I need a command line tool which will automate the export of the SQLite table file.  Another quick web search finds this collection of utilities.  The shell is what I want.  I can easily script that to load the file and dump its contents.
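
The SQLite shell is what I plan to use, but the same export can be sketched with Python's built-in sqlite3 and csv modules.  This is a swapped-in alternative, not the shell script itself.  Since I haven't named the app's tables here, the sketch assumes nothing about them and simply dumps every table it finds; dump_tables and out_prefix are made-up names.

```python
import csv
import sqlite3


def dump_tables(db_path, out_prefix):
    """Write every table in the database to its own CSV file; return the paths."""
    conn = sqlite3.connect(db_path)
    written = []
    try:
        # sqlite_master lists the schema objects stored in the database file
        tables = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]
        for table in tables:
            # Quote the table name so unusual names do not break the query
            quoted = '"' + table.replace('"', '""') + '"'
            cursor = conn.execute(f"SELECT * FROM {quoted}")
            out_path = f"{out_prefix}{table}.csv"
            with open(out_path, "w", newline="", encoding="utf-8") as f:
                writer = csv.writer(f)
                writer.writerow([col[0] for col in cursor.description])  # header
                writer.writerows(cursor)
            written.append(out_path)
    finally:
        conn.close()
    return written
```

Run it against a copy of the hex-named file rather than the original in the backup folder, so the backup itself stays untouched.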

So now I can hack my data.  While I hate the iPad's closed architecture, one thing I do love is how easy it is to hack the data from it.  I think I'll turn this into an app to generate a table of values in HTML5.  Then I'll add microdata tags based on my CDA R3 proposal.  Next I'll write some JavaScript to plot the data on a canvas with some pretty colors by getting at the raw figures in the microdata.

If I can find a browser that supports the microdata DOM, I could have this app ready in time for next week's HL7 Working Group meeting.


2 comments:

  1. Glad to see that Nate has inspired yet another great blog post!

  2. Wow! Looks like there is a serious risk of EHRs being hacked through social apps...am I right?
