Write your own version of Mint, part 1

No company can guarantee the security of your account. As more high-profile companies get hacked, I certainly would not trust a service like Mint, which has access to all my bank accounts. But I still want my own version of Mint, so I can analyze our spending and produce a quarterly financial report for my family.

It is actually not that hard to build a proof of concept. Here is what I created last night:

  • Downloaded 3 months of statements from the 5 different banks that issued our credit cards. The statement format depends on the bank and includes PDF, TXT, and CSV.
  • Created a folder for each credit card’s statements, with folder names such as “amazon”, “bluecash”, and “citi”.
  • Wrote a parser for each credit card’s statement format, e.g. “ChaseStatementParser”, so 5 parsers in total. I used Apache Commons CSV and Apache PDFBox. Below is the Java code of the Chase parser.
  • Wrote a CSV file writer. I used newer Java features such as try-with-resources and lambdas.
  • I ended up with about 10 source files and fewer than 1000 lines of code in total, a light program indeed.
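
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Assumes PDFBox 2.x (pdfbox-tools); in PDFBox 1.8 the class is org.apache.pdfbox.ExtractText.
import org.apache.pdfbox.tools.ExtractText;

public class ChaseStatementParser implements StatementParser {

    public List<ExpenseEntity> extract(String statementFilePath) {

        List<ExpenseEntity> entities = new ArrayList<>();

        // The Chase statement is a PDF, so convert it to plain text first.
        String newStatementFilePath = convertPDFFile(statementFilePath);

        try (Stream<String> stream = Files.lines(Paths.get(newStatementFilePath))) {

            // Transaction lines start with a date such as "09/20 ".
            List<String> activities = stream
                    .filter(line -> line.matches("^\\d\\d/\\d\\d\\s.*"))
                    .collect(Collectors.toList());

            for (String s : activities) {
                try {
                    // Layout: date, then description, then amount as the last token.
                    int firstBlank = s.indexOf(" ");
                    int lastBlank = s.lastIndexOf(" ");
                    float amount = Float.parseFloat(s.substring(lastBlank));
                    if (amount < 0) { // This means payment
                        continue; // assumed: payments are not expenses, so skip them
                    }
                    String date = s.substring(0, firstBlank).trim();
                    String desc = s.substring(firstBlank, lastBlank).trim();

                    entities.add(new ExpenseEntity(date, desc, amount));
                } catch (Exception ex) {
                    System.out.println("Error while processing line: " + s +
                            ", exception: " + ex.getMessage());
                }
            }
        } catch (Exception ex) {
            // Also reached when the PDF conversion failed and the path is null.
            System.out.println("Error while reading file " + newStatementFilePath + ", exception: " + ex.getLocalizedMessage());
        }

        return entities;
    }

    // Extracts the PDF text into a temporary file and returns its path, or null on failure.
    private String convertPDFFile(String inputFilePath) {

        final String outputFilePath = "/tmp/" + UUID.randomUUID().toString();
        try {
            ExtractText.main(new String[]{inputFilePath, outputFilePath});
        } catch (IOException ex) {
            System.out.println("Failed to extract text from PDF file: " + inputFilePath + ", error: " + ex.getMessage());
            return null;
        }
        return outputFilePath;
    }
}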

The program generates a CSV file that puts all the transactions from multiple credit cards together, something like:

Date,Description,Amount,Card,Category
09/20,Netflix,10.94,Chase,
09/22,Costco,98.05,Citi,
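
For illustration, here is a minimal sketch of how a CSV writer using try-with-resources and a lambda could produce output of this shape. It is not the writer from my program: the class name CsvWriterSketch, the Row type, and the helper methods are hypothetical, and it uses only the standard library rather than Apache Commons CSV.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class CsvWriterSketch {

    // Hypothetical row type standing in for the program's own entity class.
    static final class Row {
        final String date;
        final String description;
        final double amount;
        final String card;
        final String category;

        Row(String date, String description, double amount, String card, String category) {
            this.date = date;
            this.description = description;
            this.amount = amount;
            this.card = card;
            this.category = category;
        }
    }

    // Quotes a field only when it contains a comma, a double quote, or a line break.
    static String escape(String field) {
        if (field.contains(",") || field.contains("\"") || field.contains("\n")) {
            return "\"" + field.replace("\"", "\"\"") + "\"";
        }
        return field;
    }

    // Builds one CSV line; an empty category leaves a trailing comma.
    static String toCsvLine(Row r) {
        return String.join(",",
                escape(r.date),
                escape(r.description),
                String.format(Locale.ROOT, "%.2f", r.amount),
                escape(r.card),
                escape(r.category));
    }

    // Writes the header and all rows; the writer is closed automatically.
    static void write(Path out, List<Row> rows) throws IOException {
        try (BufferedWriter writer = Files.newBufferedWriter(out)) {
            writer.write("Date,Description,Amount,Card,Category");
            writer.newLine();
            rows.forEach(r -> {
                try {
                    writer.write(toCsvLine(r));
                    writer.newLine();
                } catch (IOException ex) {
                    // Lambdas cannot throw checked exceptions, so wrap it.
                    throw new UncheckedIOException(ex);
                }
            });
        }
    }

    public static void main(String[] args) throws IOException {
        List<Row> rows = Arrays.asList(
                new Row("09/20", "Netflix", 10.94, "Chase", ""),
                new Row("09/22", "Costco", 98.05, "Citi", ""));
        Path out = Files.createTempFile("expenses", ".csv");
        write(out, rows);
        Files.readAllLines(out).forEach(System.out::println);
    }
}
```

Quoting matters here because merchant descriptions on statements can contain commas, which would otherwise shift the columns.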

See how the last column is empty? One of my next steps is to use some basic regular expressions or machine learning to classify each transaction's category automatically, based on the description text and amount. For now I will label transactions manually so that I have a training data set.
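
To make the regular expression idea concrete, here is a rough sketch of a rule-based categorizer. None of it exists in the program yet: the class name RegexCategorizer, the patterns, and the category names are made up for illustration, and it looks only at the description, not the amount.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

public class RegexCategorizer {

    // Hypothetical rules, checked in insertion order; the first match wins.
    private static final Map<Pattern, String> RULES = new LinkedHashMap<>();

    static {
        RULES.put(Pattern.compile("\\b(netflix|hulu|spotify)\\b", Pattern.CASE_INSENSITIVE),
                "Entertainment");
        RULES.put(Pattern.compile("\\b(costco|safeway)\\b", Pattern.CASE_INSENSITIVE),
                "Groceries");
        RULES.put(Pattern.compile("\\b(shell|chevron)\\b", Pattern.CASE_INSENSITIVE),
                "Gas");
    }

    // Returns the category of the first matching rule, or an empty string
    // so that unmatched transactions can still be labeled by hand.
    static String categorize(String description) {
        for (Map.Entry<Pattern, String> rule : RULES.entrySet()) {
            if (rule.getKey().matcher(description).find()) {
                return rule.getValue();
            }
        }
        return "";
    }

    public static void main(String[] args) {
        System.out.println(categorize("Netflix"));
        System.out.println(categorize("Costco"));
        System.out.println(categorize("Some Local Shop"));
    }
}
```

Rules like these only cover merchants I have already seen, which is why the manually labeled rows are still useful as training data for anything the rules miss.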

In the future I want to add a web portal for the program that is accessible only from the internal network.

Needless to say, this approach depends heavily on the format of the statement files. If a bank changes its format, the parser has to be changed accordingly.
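
One way to keep such a change contained is to pick the parser from the name of the folder a statement lives in, so that a format change at one bank touches only that bank's parser. The sketch below is an assumption about how that lookup could be wired, not code from the program: ParserRegistry is a hypothetical class, and its StatementParser is a simplified stand-in that returns strings instead of ExpenseEntity objects.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ParserRegistry {

    // Simplified stand-in for the real parser interface (assumption:
    // strings are returned here only to keep the sketch self-contained).
    interface StatementParser {
        List<String> extract(String statementFilePath);
    }

    private final Map<String, StatementParser> parsersByFolder = new HashMap<>();

    // Associates a folder name such as "citi" with the parser for that card.
    void register(String folderName, StatementParser parser) {
        parsersByFolder.put(folderName, parser);
    }

    // Looks up the parser by the name of the folder containing the statement.
    StatementParser parserFor(Path statementFile) {
        Path parent = statementFile.getParent();
        Path name = parent == null ? null : parent.getFileName();
        String folder = name == null ? "" : name.toString();
        StatementParser parser = parsersByFolder.get(folder);
        if (parser == null) {
            // Fail loudly rather than silently skipping a card's statements.
            throw new IllegalArgumentException("No parser registered for folder: " + folder);
        }
        return parser;
    }

    public static void main(String[] args) {
        ParserRegistry registry = new ParserRegistry();
        registry.register("citi", path -> Collections.singletonList("citi parser read " + path));

        // Hypothetical file name, used only to show the lookup.
        Path statement = Paths.get("statements", "citi", "statement.csv");
        System.out.println(registry.parserFor(statement).extract(statement.toString()));
    }
}
```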

Read Part 2 here.

