Monday, April 12, 2021

Recovering Deleted Data in an Excel Tab (When you have a PivotTable)


 Life happens sometimes. Is this you?

  1. You leave multiple programs running because you have a few meetings to get through for the day.
  2. You hope to get back to that spreadsheet you started working on in the early morning.  
  3. You operate Excel pretty well using the keyboard, and touch type in it and other programs.
  4. Eventually, you get to the next day, discover you never got back to the spreadsheet, and so close and save it promising you will finish it up later that day.
  5. You reopen it later and find out the tab you were working on was missing.
  6. Because you downloaded it and never turned on "version control", you don't have the old version.
  7. Or a copy of what you worked on (and the changes are to recently made to have a backup).
  8. But it DOES have a pivot table summarizing the data in your missing tab.
  9. Somewhere during your calls you hit both Delete and Enter at the same time and managed to delete the Excel sheet b/c you were typing and talking at the same time, and paying attention to the Zoom call, and didn't see that you'd deleted the tab you just spend hours working through.
This post may save your data, but only if you are a bit of a computer geek and know a little something about XML. Using the techniques below managed to save mine.

What you may not know is that:
  • Later version of Microsoft Office tools, including Excel use the ZIP format to store the collection of XML files they use to store data in the new formats.  
  • Whenever you create a Pivot Table in Excel from data in your sheet (or from other data sources), Excel caches a snapshot of the data for local processing, and only refreshes from the original source when you Refresh the Pivot Table.  If you DON'T Refresh the Data in the Pivot Table, it's still in your spreadsheet.
Here is what you need to do:

  1. Make a copy of the affected file in a new work folder.
  2. If the copied file is in .XLS format, open it in Excel, and Save as a .XLSX.  This will simply change the file format, the data you need will still be in it, but now in a format that can be more readily accessed.
  3. Rename the file from *.XLSX to *.ZIP.
Next, look for xl/pivotCache/pivotCacheDefinition1.xml and xl/pivotCache/pivotCacheRecords1.xml in the ZIP file.  If you have more than one Pivot Table, you might need to look at the files ending in a different number.

The pivotCacheDefinition file contains the column heading names in the cacheField element.
You can verify the data source in the worksheetSource element.
The pivotCacheRecords file contains the rows of the sheet in the <r> elements.
Empty cells are reported in <m/> elements.
The values are found in the v attribute of elements named <b> (boolean), <d> (date), <e> (error), <n> (numeric) and <s> (string).
Some elements where the values are repeated a lot use <x>, in which case the v attribute indicates the index of the sharedItems found in the cacheField element in the pivotCacheDefinition file.  This gets a little bit complicated.

Assuming you've extracted those two files, the following XSLT will regenerate the sheet in CSV format when run over the extracted pivotCacheRecords.xml file.


<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs"
    xmlns:m="http://schemas.openxmlformats.org/spreadsheetml/2006/main"
    version="2.0">
    <xsl:output  method="text"/>
    <xsl:variable name="doc" select="doc('pivotCacheDefinition1.xml')"/>
    <xsl:template match='/'>
        <xsl:for-each select='$doc//*:cacheField/@name'>
            <xsl:if test='position()!=1'>,</xsl:if>
            <xsl:value-of select='.'/>
        </xsl:for-each>
        <xsl:text>&#xA;</xsl:text>
        <xsl:for-each select='.//m:r'>
            <xsl:for-each select='*'>
                <xsl:if test='position()!=1'>,</xsl:if>
                <xsl:variable name="pos" select="position()"/>
                <xsl:choose>
                    <xsl:when test="self::m:m"></xsl:when>
                    <xsl:when test="self::m:x">
                        <xsl:variable name="v" select="@v + 1"/>
                        <xsl:variable name="x" select="$doc//m:cacheField[$pos]/m:sharedItems/*[$v]/@v"/>
                        <xsl:if test='$x'>"</xsl:if>
                        <xsl:value-of select="replace($x,'&quot;','&quot;&quot;')"/>
                        <xsl:if test='$x'>"</xsl:if>
                    </xsl:when>
                    <xsl:otherwise>
                        <xsl:text>"</xsl:text>
                        <xsl:value-of select="replace(@v,'&quot;','&quot;&quot;')"/>
                        <xsl:text>"</xsl:text>
                    </xsl:otherwise>
                </xsl:choose>
            </xsl:for-each>
            <xsl:text>&#xA;</xsl:text>
        </xsl:for-each>
    </xsl:template>
</xsl:stylesheet>



0 comments:

Post a Comment