Friday, February 15, 2013

new MaxL "export outline" XML to delimited text parser - orders of magnitude faster

The Java code in my last post works fine for XML containing less than 70,000 members, which is usually sufficient for BSO cubes. ASO cubes, however, can contain a LOT more members. Dan Pressman contacted me earlier in the week to let me know that his XML dimension export file containing 4 million members in almost 90 million lines of XML was taking over 17 hours to run (when it wasn't crashing from running out of memory).

So I rewrote the code to make it more efficient. The same XML above with 90 million lines now takes about 3.5 minutes to run and only uses 75MB of RAM. The functionality is even improved, in that member attributes are now also output. A header line is also prepended, which, due to Java I/O, adds about a minute to the process.

I recommend using this new code for all files instead of the code in the previous post. You can find the two files needed in my GitHub essbase-parse-export-outline-xml-to-text repository.

I also created a very simple Java GUI that allows you to choose the input file, output file, and delimiter:

You can use the XMLOutlineParser.jar file from the above GitHub repository to launch the GUI. If you have a fairly recent Java jre or jdk installed (version 6 or higher), you should be able to save the jar file to your local drive, then double-click on it to launch the GUI.

P.S. As you can see in the comments, some people have experienced issues when trying to save the output file on Windows. The following steps are the workaround to this permissions issue:
  1. Go to Start -> Programs -> Accessories
  2. right click on Command Prompt and select "Run as administrator"
  3. cd to the directory where you saved essbase-parse-export-outline-xml-to-text
  4. run "java -jar XMLOutlineParser.jar"
You can avoid this issue entirely by not selecting an output location at the root of the C directory (e.g. C:\test). Rather use one such as "C:\Documents and Settings\harry\outputFile.txt".

Update: I've added the ability to pass in command line arguments, making XMLOutlineParser suitable for scripting in batch and shell scripts. Passing in arguments will prevent the GUI from running. 
  • run java -jar XMLOutlineParser.jar "input file: path & name" "output file: path & name" "field separator to use in output file"
  • -------e.g. java -jar XMLOutlineParser.jar "/Users/harry/year.xml" "/Users/harry/Documents/year.txt" "!"

Wednesday, January 30, 2013

Parse MaxL export outline xml to flat file - see next post for updated, faster code

The new (as of 11.1.2) MaxL "export outline" command has made using the Essbase Java API to export outlines obsolete. It is an order of magnitude faster and doesn't run into the typical Essbase japi issue with ports. The only problem is that it exports to XML format. The resulting file is more difficult to use in downstream processes than a delimited text file. It also can't be used to import back into Essbase. Luckily, however, it is a fairly simple matter to parse the XML to a flat file with a delimiter of your choice:
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileWriter;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

/***
 * 
 * @author Harry Gates

Usage
 * 1) Use the MaxL "export outline" command to export a dimension of the outline
 *   export outline "/path_to_otl/Sample.otl" list dimensions {"Account"} with alias_table "Default" to xml_file "/path_to_save_xml/sample_account_dim.xml";
 * 2) Change the delimiter, inputFile, and outputFile
 * 3) Compile and run MaxLExportOutlineParseXML.java
 */

public class MaxLExportOutlineParseXML implements ErrorHandler {
 public static String delimiter;
 public static String inputFile;
 public static String outputFile;
 
 public static void main(String[] args) {
  delimiter = "?";
  inputFile = "/path_to_save_xml/sample_account_dim.xml";
  outputFile = "/path_to_save_result/sample_account_dim.txt";
  
  try {
   convertXML();
  } catch (Exception e) {
   e.printStackTrace();
  }
 }
 
  public static void convertXML() throws Exception {
    String output = "";
   
    DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
    builderFactory.setNamespaceAware(false);
    builderFactory.setValidating(false);
    builderFactory.setIgnoringElementContentWhitespace(true);
    DocumentBuilder builder = null;
    
    File file = new File(inputFile);
       
    InputStream inputStream= new FileInputStream(file);
    Reader reader = new InputStreamReader(inputStream);
     
    InputSource is = new InputSource(reader);
    is.setEncoding("UTF-8");
    
    builder = builderFactory.newDocumentBuilder();
    builder.setErrorHandler(new MaxLExportOutlineParseXML());
    Document xmlDoc = builder.parse(is);
    
    NodeList mbrNL = xmlDoc.getElementsByTagName("Member");
    
    NodeList dimNL = xmlDoc.getElementsByTagName("Dimension");
    String dimName = dimNL.item(0).getAttributes().getNamedItem("name").getNodeValue().replaceAll("\\r\\n|\\r|\\n", " ");
     
    String header = "PARENT0," + dimName + delimiter + "CHILD0," + dimName + delimiter + "ALIAS0," + dimName
     + delimiter + "PROPERTY0," + dimName + delimiter + "FORMULA0," + dimName + delimiter;
    int udaCountTotal = 0;
      
    BufferedWriter outputStream;
    outputStream = new BufferedWriter(new FileWriter(outputFile));
    
    if (mbrNL.getLength() > 0) {
        Node node = null;
  
  for (int i = 0; i < mbrNL.getLength(); i++) {
      node = mbrNL.item(i);
   
      if (node.hasChildNodes()) {
       String memberName = "";
       String parentName = "";
       String UDA = "";
       String alias = "";
       String dataStorage = "";
       String timeBalance = "";
       String memberFormula = "";
       String twoPassCalc = "";
       String consolidation = "";
       String varianceReporting = "";
       int udaCount = 0;
       
       NodeList list = node.getChildNodes();
       for (int j = 0; j < list.getLength(); j++) {
         if (list.item(j).getNodeName().equals("UDA")) {
          UDA += list.item(j).getTextContent().replaceAll("\\r\\n|\\r|\\n", " ") + delimiter;
          udaCount++;
         }
         if (list.item(j).getNodeName().equals("Alias")) {
          alias = list.item(j).getTextContent().replaceAll("\\r\\n|\\r|\\n", " ").replaceAll("\"", " ");
         }      
       }
       udaCountTotal = udaCount > udaCountTotal ? udaCount : udaCountTotal;
    
    NamedNodeMap nodAttrs = node.getAttributes();
       for (int k = 0; k < nodAttrs.getLength(); k++) {
           Attr nodAttr = (Attr)nodAttrs.item(k);
           String nodName = nodAttr.getNodeName();
           if (nodName.equals("name")) {
             parentName = node.getParentNode().getAttributes().getNamedItem("name").getNodeValue().replaceAll("\\r\\n|\\r|\\n", " ");
             memberName = node.getAttributes().getNamedItem("name").getNodeValue().replaceAll("\\r\\n|\\r|\\n", " ");
             parentName = parentName.replaceAll("\"", " ");
             memberName = memberName.replaceAll("\"", " ");
           } else if (nodName.equals("Consolidation")) {
            consolidation = nodAttr.getNodeValue().replaceAll("\"", " ");
            consolidation = consolidation.trim();
           } else if (nodName.equals("DataStorage")) {
            dataStorage = nodAttr.getNodeValue().replaceAll("\"", " ");
            if (dataStorage.equals("DynamicCalc")) {
             dataStorage = "X";
            } else if (dataStorage.equals("DynamicCalcAndStore")) {
             dataStorage = "V";
            } else if (dataStorage.equals("LabelOnly")) {
             dataStorage = "O";
            } else if (dataStorage.equals("NeverShare")){
             dataStorage = "N";
            } else if (dataStorage.equals("ShareData")){
             dataStorage = "";
            }
            dataStorage = dataStorage.trim();
           } else if (nodName.equals("MemberFormula")) {
            memberFormula = nodAttr.getNodeValue();
           } else if (nodName.equals("TwoPassCalc")) {
            twoPassCalc = nodAttr.getNodeValue().replaceAll("\"", " ");
            twoPassCalc = twoPassCalc.equals("Y") ? "T" : twoPassCalc;
            twoPassCalc = twoPassCalc.trim();
           } else if (nodName.equals("TimeBalance")) {
            timeBalance = nodAttr.getNodeValue().replaceAll("\"", " ");
            if (timeBalance.equals("Last")) {
             timeBalance = "L";
            } else if (timeBalance.equals("First")) {
             timeBalance = "F";
            } else if (timeBalance.equals("Average")) {
             timeBalance = "A";
            }
            timeBalance = timeBalance.trim();
           } else if (nodName.equals("VarianceReporting")) {
            varianceReporting = nodAttr.getNodeValue().replaceAll("\"", " ");
            varianceReporting = varianceReporting.equals("Expense") ? "E" : varianceReporting;
            varianceReporting = varianceReporting.trim();
           } else {
            System.out.println(nodAttr.getNodeName() + ": " + nodAttr.getNodeValue());
           }
       }
       
       consolidation = consolidation.equals("") ? "+" : consolidation;
       consolidation = consolidation.equals("") ? "" : " " + consolidation;
       timeBalance = timeBalance.equals("") ? "" : " " + timeBalance;
       dataStorage = dataStorage.equals("") ? "" : " " + dataStorage;
       twoPassCalc = twoPassCalc.equals("") ? "" : " " + twoPassCalc;
       varianceReporting = varianceReporting.equals("") ? "" : " " + varianceReporting;
       
       String properties = consolidation + timeBalance + dataStorage + twoPassCalc + varianceReporting;
          
       output += parentName.replaceAll(" +", " ").trim()
         + delimiter + memberName.replaceAll(" +", " ").trim()
         + delimiter + alias.replaceAll(" +", " ").trim()
         + delimiter + properties.trim()
         + delimiter + memberFormula
         + delimiter + UDA.replaceAll(" +", " ").trim() + "\n";
      }
      }
    }
    System.out.println("udaCountTotal: " + udaCountTotal);
    String udaHeader = "";
    for (int i = 0; i < udaCountTotal; i++) {
     udaHeader += "UDA0," + dimName + delimiter;
    }
    header += udaHeader;
    output = header + "\n" + output;
    outputStream.append(output);
    outputStream.close();
  }
  
  public void fatalError(SAXParseException spe) throws SAXException {
    System.out.println("Fatal error at line " + spe.getLineNumber());
    System.out.println(spe.getMessage());
    throw spe;
  }

  public void warning(SAXParseException spe) {
    System.out.println("Warning at line " + spe.getLineNumber());
    System.out.println(spe.getMessage());
  }

  public void error(SAXParseException spe) {
    System.out.println("Error at line " + spe.getLineNumber());
    System.out.println(spe.getMessage());
  }
}