Converting a text file to XML with Python

Asked

Viewed 742 times

0

Imagine I have the following command:

echo -e "`date`\n\n`free`\n\n`vmstat`\n" >> free_vmstat_output.txt

the generated file (the output of the above command) would be (in TEXT mode):

Fri Mar 31 22:19:55 -03 2017

             total       used       free     shared    buffers     cached
Mem:      16387080    7085084    9301996     386628     147468    3340724
-/+ buffers/cache:    3596892   12790188
Swap:     13670396          0   13670396

procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0      0 9302192 147468 3340724    0    0    27    93  173   89  5  1 93  1  0

How to turn it into XML, that is, I need to read the free_vmstat_output.txt file, which is the output of the command and turn it into XML?

Fri Mar 31 22:19:55 -03 2017 -> seria uma tag data

Each element below would be a tag and enter its values:

total       used       free     shared    buffers     cached

I just started the following snippet of code:

#coding:utf-8

from xml.dom.minidom import Document

doc = Document()
root = doc.createElement('InfoMemoria')

1 answer

2


There’s no magic there - is a complex data structure, which you want to transform into another complex structure -

have to read the source file, line by line, separate the information elements of each row using the Python string tools (linha.strip().split("") should give you a list of each header or data field.)

And, for each desired value, you have to create a new node, at the right place, all of this in order to have an XML in which the information related to the vms will be something like: <procs><r>0</r><b><0></procs><memory><swpd>0</swpd> <free>9302192</free>...</memory>

Few things would be more boring than creating an xml like this, among which, CONSUME an xml like this.

So - let’s take a step back: first: if you will sweat Python to manipulate the data relating to the system, why use a complex command line to paste as text file the output of individual commands that has the information you want?

Do everything in Python - including calling external commands free and vmstat - For timestam, obviament you don’t need to call an external process, just use Python datetime

Second: Think about the program that will consume this information, and generate a structured data file that can be : easier to read from the other side, easier to generate with your Pythn program, more readable when looked at, and significantly smaller! XML loses in all these questions to almost any thing - it is even more verbose and less readable than collage as original text file. To send a structured and easily readable format, you can think of a JSON file, for example, instead of XML:

{"timestamp: ...
 "free": {"mem": {'total': ..., 'used': ...,  ...},
          "swap": {"total", ..., "used": ...}},
 "vmstat": ...
}

Or -if you want to generate several of these to monitor the use of the machine in someone else, it will make much more sense to put everything in a database, instead of generating XML or JSON packages. (ok, if XML or JSON can be to be transmitted to a remote process that does this).

Well, anyway, you create in the Python program a distinct function to execute the external command, parse the data, and already return the structured infection - as a Python dictionary is cool - Mesmoq I you choose to use XML - the dictionary will be a good intermediate data structure.

Example of a function that calls "free" and parses the data for a dictionary structure:

import subprocess

def get_free():
    data = subprocess.check_output("free").decode("utf-8").split("\n")
    memory = data[1].split()
    swap = data[2].split()
    result = {'memory': {'total': int(memory[1]), used: int(memory[2])},
              'swap': {'total': int(swap[1]), used: int(swap[2])}}
    return result

You can create a similar function to read the "vmstat" data, combine the dictionaries with a timestamp, and use direct "json.dump" to have all the data machine-readable and ready to be transmitted --- or, create another function that creates your XML based on already structured data.

Browser other questions tagged

You are not signed in. Login or sign up in order to post.