Category: Tutorial

Day 11: Store your data in MongoDB

MongoDB is a cross-platform document-oriented database. As a NoSQL database, MongoDB uses JSON-like documents (BSON) with dynamic schemas, making the integration of data in applications easier and faster. Install guides for various platforms are available in the MongoDB manual. To install the corresponding Catmandu module, run:

$ cpanm Catmandu::Store::MongoDB

Now get some JSON data to work with:

$ wget -O banned_books.json https://lib.ugent.be/download/librecat/data/verbannte-buecher.json

First import the data to MongoDB. You have to specify in which database (--database_name) and collection (--bag) you want to store the data:

$ catmandu import -v JSON --multiline 1 to MongoDB --database_name books --bag banned < banned_books.json

Now you can export all items from a collection to different formats, like XLSX, YAML and XML:

$ catmandu export MongoDB --database_name books --bag banned to YAML
$ catmandu export MongoDB --database_name books --bag banned to XML
$ catmandu export -v MongoDB --database_name books --bag banned to XLSX --file banned_books.xlsx

You can count all items in a collection or those which match a query:

$ catmandu count MongoDB --database_name books --bag banned
$ catmandu count MongoDB --database_name books --bag banned --query '{"firstEditionPublicationYear": "1937"}'
$ catmandu count MongoDB --database_name books --bag banned --query '{"firstEditionPublicationPlace": "Berlin"}'

MongoDB uses a JSON-like query language that supports a variety of operators.

You can query a collection for a specific value and export all matching items:

$ catmandu export MongoDB --database_name books --bag banned --query '{"firstEditionPublicationYear": "1937"}' to JSON
$ catmandu export MongoDB --database_name books --bag banned --query '{"firstEditionPublicationPlace": "Berlin"}' to CSV --fields '_id,authorFirstname,authorLastname,title,firstEditionPublicationPlace'

You can use regular expressions for queries, e.g. to get all items which were published at a place starting with “B”:

$ catmandu export MongoDB --database_name books --bag banned --query '{"firstEditionPublicationPlace": {"$regex":"^B.*"}}' to CSV --fields '_id,firstEditionPublicationPlace'

MongoDB supports several comparison operators, e.g. you can query items which were published before/after a specific date or at specific places:

$ catmandu export MongoDB --database_name books --bag banned --query '{"firstEditionPublicationYear": {"$lt":"1940"}}' to CSV --fields '_id,firstEditionPublicationYear'
$ catmandu export MongoDB --database_name books --bag banned --query '{"firstEditionPublicationYear": {"$gt":"1940"}}' to CSV --fields '_id,firstEditionPublicationYear'
$ catmandu export MongoDB --database_name books --bag banned --query '{"firstEditionPublicationPlace":{"$in":["Berlin","Bern"]}}' to CSV --fields '_id,firstEditionPublicationPlace'

Logical operators are also supported, so you can combine query clauses:

$ catmandu export MongoDB --database_name books --bag banned --query '{"$and":[{"firstEditionPublicationYear": "1937"},{"firstEditionPublicationPlace": "Berlin"}]}' to JSON
$ catmandu export MongoDB --database_name books --bag banned --query '{"$or":[{"firstEditionPublicationPlace": "Berlin"},{"secondEditionPublicationPlace": "Berlin"}]}' to JSON

With the element query operators you can match items that contain a specified field:

$ catmandu export MongoDB --database_name books --bag banned --query '{"field_xyz":{"$exists":"true"}}'
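The operators shown so far can also be combined in a single query. The following is a minimal sketch, not one of the original examples: it assumes the books database and the banned bag created above, and the chosen combination (first published before 1940 in a place starting with “B”) is only an illustration. Since the queries above match the year as a string, $lt compares text here, which works for four-digit years.

```shell
# Hypothetical combined query: a comparison operator and a regular expression inside $and.
QUERY='{"$and":[{"firstEditionPublicationYear":{"$lt":"1940"}},{"firstEditionPublicationPlace":{"$regex":"^B"}}]}'

# Run the export only where catmandu is installed (a MongoDB server must be running as well).
if command -v catmandu >/dev/null 2>&1; then
  catmandu export MongoDB --database_name books --bag banned --query "$QUERY" to CSV --fields '_id,firstEditionPublicationYear,firstEditionPublicationPlace'
fi
```

Keeping the query in a variable with single quotes prevents the shell from interpreting the $ signs of the MongoDB operators.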

Collection and items can be moved within MongoDB or even to other stores or search engines:

$ catmandu move MongoDB --database_name books --bag banned --query '{"firstEditionPublicationPlace": "Berlin"}' to MongoDB --database_name books --bag berlin
$ catmandu move MongoDB --database_name books --bag banned to Elasticsearch --index_name books --bag banned

You can delete whole collections from a database or just items which match a query:

$ catmandu delete MongoDB --database_name books --bag banned --query '{"firstEditionPublicationPlace": "Berlin"}'
$ catmandu delete MongoDB --database_name books --bag banned

MongoDB supports several more methods. These methods are not available via the Catmandu command line interface, but can be used in Catmandu modules and scripts.

See Catmandu::Store::MongoDB for further documentation.

Continue to Day 12: Index your data with ElasticSearch >>

Day 10: Working with CSV and Excel files

CSV and Excel files are widely used to store and exchange simple structured data. Many open datasets are published as CSV files, e.g. datahub.io. Within the library community CSV files are used for the distribution of title lists (KBART), e.g. Knowledge Base+. Excel spreadsheets are often used to generate reports.

Catmandu implements importers and exporters for both formats. The CSV module is already part of the core system; the Catmandu::XLS and Catmandu::Exporter::Table modules may have to be installed separately (note these steps are not required if you have the virtual catmandu box):

$ sudo cpanm Catmandu::XLS
$ sudo cpanm Catmandu::Exporter::Table

Get some CSV data to work with:

$ curl "https://lib.ugent.be/download/librecat/data/goodreads.csv" > goodreads.csv

Now you can convert the data to different formats, like JSON, YAML and XML.

$ catmandu convert CSV to XML < goodreads.csv
$ catmandu convert CSV to XLS --file goodreads.xls < goodreads.csv
$ catmandu convert XLS to JSON < goodreads.xls
$ catmandu convert CSV to XLSX --file goodreads.xlsx < goodreads.csv
$ catmandu convert XLSX to YAML < goodreads.xlsx

You can extract specified fields while converting to another tabular format. This is quite handy for analysis of specific fields or to generate reports.

$ catmandu convert CSV to TSV --fields ISBN,Title < goodreads.csv
$ catmandu convert CSV to XLS --fields 'ISBN,Title,Author' --file goodreads.xls < goodreads.csv

The field names are read from the header line or must be given via the ‘fields’ parameter.

By default Catmandu expects that CSV fields are separated by a comma ‘,’ and strings are quoted with double quotes ‘"’. You can specify other characters as separator or quotes with the parameters ‘sep_char’ and ‘quote_char’:

$ echo '12157;$The Journal of Headache and Pain$;2193-1801' | catmandu convert CSV --header 0 --fields 'id,title,issn' --sep_char ';' --quote_char '$'

In the example above we create a little CSV fragment using the “echo” command for our small test. It will print a tiny CSV string which uses “;” and “$” as separation and quotation characters.
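The same options also work on a file that carries its own header line. This is a small sketch under that assumption: the file name journals.csv is made up for illustration, and the record is the one from the echo example.

```shell
# Write a two-line sample file: a header line followed by one record.
printf '%s\n' 'id;title;issn' '12157;$The Journal of Headache and Pain$;2193-1801' > journals.csv

# With a header line present, --header 0 and --fields are not needed.
if command -v catmandu >/dev/null 2>&1; then
  catmandu convert CSV --sep_char ';' --quote_char '$' to YAML < journals.csv
fi
```

The single quotes around ';' and '$' keep the shell from treating these characters as a command separator or the start of a variable.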

When exporting data to a tabular format you can change the field names in the header or omit the header:

$ catmandu convert CSV to CSV --fields 'ISBN,Title,Author' --columns 'A,B,C' < goodreads.csv
$ catmandu convert CSV to TSV --fields 'ISBN,Title,Author' --header 0 < goodreads.csv

If you want to export complex/nested data structures to a tabular format, you must “flatten” the data structure. This can be done with “Fixes”.
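As a rough sketch of such a flattening step, the example below uses only the move_field, join_field and remove_field fixes. The record, the field names and the file names nested.json and flatten.fix are hypothetical and serve only to illustrate the idea.

```shell
# A hypothetical nested record: "author" is a hash, "keywords" is an array.
printf '%s\n' '{"title":"Perl basics","author":{"first":"Ada","last":"Smith"},"keywords":["perl","dbi"]}' > nested.json

# Fixes that pull nested values up to simple top-level fields.
cat > flatten.fix <<'EOF'
move_field(author.last,author_last)
move_field(author.first,author_first)
join_field(keywords,";")
remove_field(author)
EOF

# Export the flattened record as CSV where catmandu is available.
if command -v catmandu >/dev/null 2>&1; then
  catmandu convert JSON --fix flatten.fix to CSV < nested.json
fi
```

After these fixes every field holds a plain string, which is what a tabular exporter can write into a single cell.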

See Catmandu::Importer::CSV, Catmandu::Exporter::CSV and Catmandu::XLS for further documentation.

Continue in Day 11: Store your data in MongoDB >>

Day 9: Processing MARC with Catmandu

In the previous days we learned how we can use the catmandu command to process structured data like JSON. Today we will use the same command to process MARC metadata records. In this process we will see that MARC can be processed using JSON paths but this is a bit cumbersome. We will introduce MARCspec as an easier way to point to parts of a MARC record.

As always, you need to start up your Virtual Catmandu (hint: see our day 1 tutorial) and start up the UNIX prompt (hint: see our day 2 tutorial).

In the Virtual Catmandu installation we provided a couple of example MARC files that we can inspect with the UNIX command cat or less. In the UNIX prompt inspect the file Documents/camel.usmarc, for instance, with cat:

$ cat Documents/camel.usmarc

You should see something like this:

[Screenshot: raw MARC data of Documents/camel.usmarc as printed by cat]

Like JSON, the MARC file contains structured data but the format is different. All the data is on one line, but at first sight there isn’t a clear separation between fields and values. The field/value structure is there, but you need to use a MARC parser to extract this information. Catmandu contains a MARC parser which can be used to interpret this file. Type the following command to transform the MARC data into YAML (which we introduced in the previous posts):

$ catmandu convert MARC to YAML < Documents/camel.usmarc

You will see something like this:

[Screenshot: the MARC records converted to YAML]

When transforming MARC into YAML the result looks like a simple top-level field _id containing the identifier of the MARC record and a record field with a deeper array structure (or, more correctly, an array-of-arrays structure).

We can use catmandu to read the _id fields of the MARC record with the retain_field fix we learned in the Day 6 post:

$ catmandu convert MARC --fix 'retain_field(_id)' to YAML < Documents/camel.usmarc

You will see:

---
_id: 'fol05731351 '
...
---
_id: 'fol05754809 '
...
---
_id: 'fol05843555 '
...
---
_id: 'fol05843579 '
...

What is happening here? The MARC file Documents/camel.usmarc contains more than one MARC record. For every MARC record catmandu extracts the _id field.

Extracting data out of the MARC record itself is a bit more difficult. MARC is an array of arrays, so you need indexes to extract the data. For instance the MARC leader is usually in the first field of a MARC record. In the previous posts we learned that you need to use the 0 index to extract the first field out of an array:

$ catmandu convert MARC --fix 'retain_field(record.0)' to YAML < Documents/camel.usmarc
---
_id: 'fol05731351 '
record:
- - LDR
  - ~
  - ~
  - _
  - 00755cam  22002414a 4500
...

The leader value itself is the fifth entry in the resulting array. So, we need index 4 to extract it:

$ catmandu convert MARC --fix 'copy_field(record.0.4,leader); retain_field(leader)' to YAML < Documents/camel.usmarc

We used here a copy_field fix to extract the value into a field called leader. The retain_field fix is used to keep only this leader field in the result. To process MARC data this way would be very verbose, plus you need to know at which index position the fields are that you are interested in. This is something you usually don’t know.

Catmandu introduces Carsten Klee’s MARCspec to ease the extraction of MARC values out of a record. With the marc_map fix the command above would read:

marc_map("LDR",leader)
retain_field(leader)

I skipped writing the catmandu commands here (they will be the same every time). You can put these fixes into a file using nano (see the Day 5 post) and execute it as:

$ catmandu convert MARC --fix myfixes.txt to YAML < Documents/camel.usmarc

Where myfixes.txt contains the fixes above.

To extract the title fields, the field 245 remember? ;), you can write:

marc_map("245",title)
retain_field(title)

Or, if you are only interested in the $a subfield you could write:

marc_map("245a",title)
retain_field(title)

More elaborate mappings are possible. I’ll show you more complete examples in the next posts. As a warm-up, here is some code to extract all the record identifiers, titles and ISBN numbers in a MARC file into a CSV listing (which you can open in Excel).

Step 1, create a fix file myfixes.txt containing:

marc_map("245",title)
marc_map("020a",isbn.$append)
join_field(isbn,",")
remove_field(record)

Step 2, execute this command:

$ catmandu convert MARC --fix myfixes.txt to CSV < Documents/camel.usmarc

You will see this as output:

_id,isbn,title
"fol05731351 ","0471383147 (paper/cd-rom : alk. paper)","ActivePerl with ASP and ADO /Tobias Martinsson."
"fol05754809 ",1565926994,"Programming the Perl DBI /Alligator Descartes and Tim Bunce."
"fol05843555 ",,"Perl :programmer's reference /Martin C. Brown."
"fol05843579 ",0072120002,"Perl :the complete reference /Martin C. Brown."
"fol05848297 ",1565924193,"CGI programming with Perl /Scott Guelich, Shishir Gundavaram & Gunther Birznieks."
"fol05865950 ",0596000138,"Proceedings of the Perl Conference 4.0 :July 17-20, 2000, Monterey, California."
"fol05865956 ",1565926099,"Perl for system administration /David N. Blank-Edelman."
"fol05865967 ",0596000278,"Programming Perl /Larry Wall, Tom Christiansen & Jon Orwant."
"fol05872355 ",013020868X,"Perl programmer's interactive workbook /Vincent Lowe."
"fol05882032 ","0764547291 (alk. paper)","Cross-platform Perl /Eric F. Johnson."

In the fix above we mapped the 245-field to the title. The ISBN is in the 020-field. Because MARC records can contain one or more 020 fields we created an isbn array using the isbn.$append syntax. Next we turned the isbn array back into a comma separated string using the join_field fix. As a last step we deleted all the fields we didn’t need in the output with the remove_field fix.
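The listing can be extended in the same way with further MARC fields. The sketch below adds the main author from field 100, subfield a. The file name myfixes_author.txt and the field name author are our own choices and not part of the original example.

```shell
# Same fixes as in step 1 plus one extra mapping for the personal name in 100$a.
cat > myfixes_author.txt <<'EOF'
marc_map("245a",title)
marc_map("100a",author)
marc_map("020a",isbn.$append)
join_field(isbn,",")
remove_field(record)
EOF

# Documents/camel.usmarc is the sample file of the Virtual Catmandu installation.
if command -v catmandu >/dev/null 2>&1 && [ -f Documents/camel.usmarc ]; then
  catmandu convert MARC --fix myfixes_author.txt to CSV --fields '_id,author,title,isbn' < Documents/camel.usmarc
fi
```

The quoted 'EOF' keeps the shell from expanding $append inside the fix file. Records without a 100 field simply get an empty author column.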

In this post we demonstrated how to process MARC data. In the next post we will show some examples of how catmandu can typically be used to process library data.

Continue with Day 10: Working with CSV and Excel files >>

Day 8: Processing JSON data from webservices

During the last two days we got an introduction into Catmandu and learned how to transform structured JSON data. The JSON data in these examples was first fetched from a URL with the curl command. Today we will learn how to simplify fetching more data from web services.

In short, a web service is a server that can be queried by HTTP requests. Most web services return JSON data if queried with a URL. For instance the weather web service used during the last two days is documented at openweathermap.org/api. To retrieve current weather data from selected cities, we used commands and URLs like this:


$ curl http://api.openweathermap.org/data/2.5/weather?q=Gent,be
$ curl http://api.openweathermap.org/data/2.5/weather?q=Tokyo,jp

The URLs differ only in their query parameter q, so we can construct a so-called URL template. The form of a URL template is defined in RFC 6570, so our template is:

http://api.openweathermap.org/data/2.5/weather{?q}

Catmandu supports URL templates to retrieve JSON data with its getJSON Importer. Let’s use it to fetch weather data for Tokyo:

$ echo '{"q":"Tokyo,jp"}' | catmandu convert getJSON --url 'http://api.openweathermap.org/data/2.5/weather{?q}'

URL templates make the most sense when applied to multiple values, so let’s create a list of cities. We could use a text editor, as learned on day 5, but here is an alternative way to learn something new:

$ echo q > cities.csv
$ echo Ghent,be >> cities.csv
$ echo Tokyo,jp >> cities.csv
$ echo Berlin,de >> cities.csv
$ catmandu convert CSV --sep_char _ to JSON < cities.csv > cities.json

We first created the CSV file cities.csv by appending one line after another. The > character is used to redirect output to a file and >> can be used to append to a file instead of overwriting it. You will learn more about processing CSV files in a later article. The last command converts the CSV file to line-separated JSON; the option --sep_char _ makes sure that the comma in each line is not treated as a field separator. Have a look at both files with cat:

$ cat cities.csv
q
Ghent,be
Tokyo,jp
Berlin,de

$ cat cities.json
{"q":"Ghent,be"}
{"q":"Tokyo,jp"}
{"q":"Berlin,de"}
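The four echo commands can also be collapsed into a single call. This is just an alternative sketch using the standard printf command and produces the same cities.csv:

```shell
# printf repeats the format '%s\n' for every argument, so each argument becomes one line.
printf '%s\n' q 'Ghent,be' 'Tokyo,jp' 'Berlin,de' > cities.csv

# Show the result.
cat cities.csv
```

Because a single > is used, running the command again overwrites the file instead of appending to it.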

Now we can finally use this list of cities to retrieve weather data in one call:

$ cat cities.json | catmandu convert getJSON --url 'http://api.openweathermap.org/data/2.5/weather{?q}'

Try appending “to YAML” or “to JSON --pretty 1” to this command to get a better view of the data, as described in the introduction into catmandu (day 6)!

To better see what’s going on we can skip retrieving data and just get the full URLs instead. This is done by setting the option --dry to 1:

$ catmandu convert getJSON --dry 1 --url 'http://api.openweathermap.org/data/2.5/weather{?q}' < cities.json
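To get a feeling for what the template does, the substitution can be imitated by hand in the shell. This is only a simplified sketch: expand_url is a hypothetical helper written for this illustration, and unlike a real RFC 6570 expansion it does not percent-encode the value, so the URLs printed by catmandu may look slightly different.

```shell
TEMPLATE='http://api.openweathermap.org/data/2.5/weather{?q}'

# Hypothetical helper: replace the trailing {?q} by ?q=<value>.
# It does NOT percent-encode the value as a real RFC 6570 expansion would.
expand_url() {
  base=${TEMPLATE%'{?q}'}
  printf '%s?q=%s\n' "$base" "$1"
}

# Print one URL per city.
for city in 'Ghent,be' 'Tokyo,jp' 'Berlin,de'; do
  expand_url "$city"
done
```

Each record read by getJSON plays the role of one loop iteration here: its q field supplies the value for the template.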

With the knowledge from previous days we can extract some information. Here is an improved fix to get both the name and the temperature:

retain_field(main.temp)
move_field(name,main.name)
retain_field(main)

Save this fix as file weather2.fix and get the temperature of cities of your choice:

$ cat cities.json | catmandu convert getJSON --url 'http://api.openweathermap.org/data/2.5/weather{?q}' --fix weather2.fix

The getJSON Importer can be used to retrieve JSON data from various web services. Catmandu further includes specialized importers for selected web services.

Continue to Day 9: Processing MARC with Catmandu >>

Day 7: Catmandu JSON paths

Yesterday we learned the command catmandu and how it can be used to parse structured information. Today we will go deeper into catmandu and describe how to pluck data out of structured information. As always, you need to start up your Virtual Catmandu (hint: see our day 1 tutorial) and start up the UNIX prompt (hint: see our day 2 tutorial).

Today we will fetch a new weather report and store it in a new file weather2.json. Let’s try to download the report for Tokyo:

$ curl https://bit.ly/2J9sd6N > weather2.json

From the previous tutorials we know many commands to examine this data set. For instance, to get a quick overview of the content of weather2.json we can use the cat command:

$ cat weather2.json

Or, we could use the less command:

$ less weather2.json

Remember to type the ‘q’ key to exit less.

We could also use nano to inspect the data, but we skip that for now. Nano is a text editor and is not particularly suited for data.

To count the number of lines, words and characters in weather2.json we can use the wc command:

$ wc weather2.json
1 3 463

This output shows that weather2.json contains 1 line, 3 words and 463 characters. The 1 line is indeed correct: the file contains one big line of JSON. The 463 characters is also correct: when you count every character including spaces you get to 463. But 3 words is obviously wrong. Generic UNIX programs like wc have trouble with counting words in structured information. The command doesn’t know this file is in the JSON format which contains fields and values. You need to use specialized tools like catmandu to make sense of this output.
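The reason for the odd word count is that wc treats everything between two whitespace characters as one word. A tiny sketch with a made-up JSON document (the file name tiny.json is only for illustration) shows the effect:

```shell
# A tiny JSON document with two fields; the only space is inside "light rain".
printf '%s\n' '{"name":"Gent","description":"light rain"}' > tiny.json

# wc -w splits on whitespace only, so it reports 2 "words" here.
wc -w < tiny.json
```

The two fields and their values are invisible to wc; it only sees the text before and after the single space.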

We also saw in the previous post how you can use catmandu to transform the JSON format into the YAML format which is easier to read and contains the same information:

$ catmandu convert JSON to YAML < weather2.json

[Screenshot: weather2.json converted to YAML]

We also learned some fixes to retrieve information out of the JSON file like retain_field(main.temp).

In this post we delve a bit deeper into ways to point to fields in a JSON file.

This main.temp is called a JSON Path and points to a part of the JSON data you are interested in. The data, as shown above, is structured like a tree. There are top level simple fields like: base,cod,dt,id which contain only text values or numbers. There are also fields like coord that contain a deeper structure like lat and lon.

Using a JSON path you can point to every part of the JSON file using a dot-notation. For simple top level fields the path is just the name of the field:

  • base
  • cod
  • dt
  • id
  • name

For the fields with deeper structure you add a dot ‘.’ to point to the leaves:

  • clouds.all
  • coord.lat
  • coord.lon
  • main.temp
  • etc…

For example, suppose you have a deeply nested structure like:

[Screenshot: YAML with the fields x, y, z, a, b and c nested inside each other]

Then you would point to the c field with the JSON Path x.y.z.a.b.c.

There is one extra path structure I would like to explain and that is when a field can have more than one value. This is called an array and looks like this in YAML:

[Screenshot: YAML with a field my containing a field colors with three values]

In the example above you see a field my which contains a deeper field colors which has 3 values. To point to one of the colors you need to use an index. The first index in an array has value 0, the second the value 1, the third the value 2. So, the JSON path of the color red would be:

  • my.colors.2

In almost all programming languages things get counted starting with 0. An old programming joke is:

There are 10 types of people in the world:
Those who understand binary,
Those who don’t,
And those who count from zero.

(hint: this is a double joke: 10 in binary is 2, and 2 is also the number of the last of the three types when you count from 0).

There is one array type in our JSON report and that is the weather field. To point to the description of the weather you need the JSON Path weather.0.description.
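A path with an index can be used in fixes like any other path. The following sketch assumes a shortened copy of the Gent weather report from Day 6, saved under the made-up name weather_sample.json, and uses the copy_field and retain_field fixes to pull out the description.

```shell
# Shortened sample based on the Gent weather report from Day 6.
printf '%s\n' '{"name":"Gent","weather":[{"id":500,"main":"Rain","description":"light rain","icon":"10d"}]}' > weather_sample.json

if command -v catmandu >/dev/null 2>&1; then
  # Copy the description of the first array element to a top-level field, then keep only that field.
  catmandu convert JSON --fix 'copy_field(weather.0.description,description); retain_field(description)' to YAML < weather_sample.json
fi
```

The 0 in weather.0.description selects the first (and here the only) entry of the weather array.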

In this post we learned the JSON Path syntax and how it can be used to point to the parts of a JSON data set you want to manipulate. We explained the JSON path using a YAML transformation as example, because this is easier to read. YAML and JSON are two formats that contain the same informational content (and thus both can work with JSON Path) but look different when written into a file.

Continue to Day 8: Processing JSON data from webservices >>

Day 6: Introduction into Catmandu

In the previous days we learned the UNIX commands grep, nano, ls and less. Today we will introduce you to a UNIX command we have created in the LibreCat project called catmandu. The catmandu command is used to process structured information. To demo this command, as always, you need to start up your Virtual Catmandu (hint: see our day 1 tutorial) and start up the UNIX prompt (hint: see our day 2 tutorial).

In this tutorial we are going to process structured information. We call data structured when it is organised in such a way that it is easily processable by computers. Previously we processed text documents like War and Peace which is structured only in words and sentences, but a computer doesn’t know which words are part of the title or which words contain names. We had to tell the computer that. Today we will download a weather report in a structured format called JSON and inspect it with the command catmandu.

At the UNIX prompt type in this command:

$ curl http://api.openweathermap.org/data/2.5/weather?q=Gent,be

[Update: as of the end of 2015 the OpenWeatherMap API requires an API key. Use this link to download a copy of the Ghent weather report:

$ curl https://gist.githubusercontent.com/phochste/7673781b19690f66cada/raw/67050da98a7e04b3c56bb4a8bc8261839af57e35/weather.json

]

You will see a JSON output like:

{"coord":{"lon":3.72,"lat":51.05},"sys":{"type":3,"id":4839,"message":0.0349,"country":"BE",
"sunrise":1417159365,"sunset":1417189422},"weather":[{"id":500,"main":"Rain","description":"light rain",
"icon":"10d"}],"base":"cmc stations","main":{"temp":281.15,"pressure":1006,"humidity":87,"temp_min":281.15,
"temp_max":281.15},"wind":{"speed":3.6,"deg":100},"rain":{"3h":0.5}
,"clouds":{"all":56},"dt":1417166878,"id":2797656,
"name":"Gent","cod":200}

All these fields tell something about the current weather in Gent, Belgium. You can recognise that there is a light rain and the temperature is 281.15 degrees Kelvin (about 8 degrees Celsius).  Write the output of this command to a file weather.json (using the ‘>’ sign we learned in the day 5 tutorial) so that we can use it in the next examples.

$ curl https://gist.githubusercontent.com/phochste/7673781b19690f66cada/raw/67050da98a7e04b3c56bb4a8bc8261839af57e35/weather.json > weather.json

When you type the ls command you should see the new file name weather.json appearing.

With the catmandu command you can process this file to make it a bit easier to read. For instance type:

$ catmandu convert JSON to YAML < weather.json

YAML is another format for structured information which is a bit easier to read for human eyes. Our weather report should now look like this:

[Screenshot: the weather report converted to YAML]

Catmandu can be used to process structured information like the UNIX grep command can process unstructured information. For instance let’s try to filter out the name of this report. Type in this command:

$ catmandu convert JSON --fix 'retain_field(name)' to YAML < weather.json

You should end up with something like:

---
name: Gent
...

The --fix option in Catmandu is used to ‘massage’ the input weather.json, filtering the fields we would like to see. Only one fix was used, ‘retain_field’, which throws away all the data from the input except the ‘name’ field. By the way, the file weather.json wasn’t changed! We only read the file and displayed the output of the catmandu command.

The temperature in Gent is in the ‘temp’ part of the ‘main’ section in weather.json. To filter this out we need two retain_field fixes: one for the main section, one for the temp section:

$ catmandu convert JSON --fix 'retain_field(main); retain_field(main.temp)' to YAML < weather.json

You should now see something like this:

---
main:
  temp: 281.15
...

When massaging data you often need to create many fixes to process a data file in the format you need. With the nano command you can write all the fixes in a file. Start the nano editor with the command:

$ nano weather.fix

In nano type now the two fixes above:

retain_field(main)
retain_field(main.temp)

To exit nano type Ctrl-X, press Y to confirm the changes and press Enter to confirm the file name.

With this file it will be a bit easier to create many fixes. The name of the fix file can be used to repeat the commands above:

$ catmandu convert JSON --fix weather.fix to YAML < weather.json
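If you prefer not to open an editor, the same fix file can be written directly from the prompt. This is an alternative sketch using a so-called here-document, a standard shell feature that was not covered in the earlier days:

```shell
# Write the fix file without opening an editor; the quoted 'EOF' keeps the text literal.
cat > weather.fix <<'EOF'
retain_field(main)
retain_field(main.temp)
EOF

# Apply the fixes if catmandu and the downloaded weather report are available.
if command -v catmandu >/dev/null 2>&1 && [ -f weather.json ]; then
  catmandu convert JSON --fix weather.fix to YAML < weather.json
fi
```

Everything between the line with <<'EOF' and the closing EOF ends up in weather.fix, one fix per line.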

To add more fixes we can again edit the weather.fix file. Type:

$ nano weather.fix

And add these lines after the two previous lines:


prepend(main.temp,"The temperature is ")
append(main.temp," degrees Kelvin")

Save the changes with Ctrl-X, Y, Enter and execute catmandu again:

$ catmandu convert JSON --fix weather.fix to YAML < weather.json

You should now see as output:

---
main:
  temp: The temperature is 281.15 degrees Kelvin
...

Catmandu contains many fixes to manipulate data. Check the documentation to get a complete list. This post only presented a short introduction into catmandu. In the next posts we will go deeper into its capabilities.

Continue to Day 7: Catmandu JSON paths >>

Day 5: Editing text with nano

Yesterday we looked at the commands grep, wc and less. Today we will show you how to store and edit files in UNIX. First, as always, you need to start up your Virtual Catmandu (hint: see our day 1 tutorial). Start up the UNIX prompt (hint: see our day 2 tutorial) and type in the command ‘nano’:

$ nano

You will be presented with the GNU nano text editor.

[Screenshot: the GNU nano text editor]

In this text editor you can type text or programs you can save on disk for later use. In this short tutorial I will guide you through some basic commands we will need in later tutorials. Type for instance a short text in this screen:

“Hello  world. My name is …”

When you want to save this text into a file type Ctrl-o (that is, pressing the Ctrl key and the ‘o’ key on your keyboard). At the bottom of the screen nano will ask for a filename.

[Screenshot: nano asking for a file name at the bottom of the screen]

Type for instance ‘hello.txt’ as filename and press return. The file ‘hello.txt’ is now created on disk. We can test this with the commands we learned in the previous tutorial.

First exit the nano editor by typing Ctrl-x. Then type ‘cat hello.txt’:

$ cat hello.txt

You will now see the text created in the nano editor. With the UNIX command ‘ls’ you can view all the filenames in the current directory.

$ ls

If you want to add more text to this file you can start the nano editor again with a file name.

$ nano hello.txt

You will again see the text you can edit and save again with Ctrl-o and exit nano with Ctrl-x.

Output of UNIX commands can also be written to a file. Let’s try to find all the lines in War and Peace that contain Bolkonski and inspect the results with nano:

$ cat Documents/war_and_peace.txt | grep Bolkonski > bolkonski.txt

Here we use the key ‘>’ to redirect the output of the command grep to a file named ‘bolkonski.txt’. Next we can use nano to inspect the contents of this file.

$ nano bolkonski.txt
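Redirection can be tried out safely on a small file first. The sketch below uses a few made-up lines as a stand-in for Documents/war_and_peace.txt and also shows ‘>>’, which appends to a file instead of overwriting it. The file names sample.txt and names.txt are only for illustration.

```shell
# Small stand-in for Documents/war_and_peace.txt (hypothetical sample lines).
printf '%s\n' 'Prince Bolkonski arrived.' 'Pierre was late.' 'Bolkonski smiled.' 'Natasha sang.' > sample.txt

grep Bolkonski sample.txt > names.txt   # > creates or overwrites names.txt
grep Natasha sample.txt >> names.txt    # >> appends to names.txt
wc -l < names.txt                       # three matching lines in total
```

Running the first grep again would start names.txt from scratch, while repeating the second one would add another copy of its matching line.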

By the way, you don’t need to type in the complete filenames in all the commands we have shown in the examples. When you type ‘bo’ and hit the tab key then UNIX will autocomplete the file name to ‘bolkonski.txt’. I’m lazy and would type ‘cat bol’ and press tab.

Again you can use Ctrl-x to exit nano. You can view all the files with the ls command.

$ ls

bolkonski.txt  Documents  hello.txt  Pictures  
Templates  Videos Desktop        Downloads  
Music      Public    test.fix

If you want to delete a file you can use the rm command. We can try to remove our bolkonski.txt file like this:

$ rm bolkonski.txt

This concludes our short excursion into UNIX. Monday we will be back with a new chapter: processing JSON with Catmandu. Have a nice weekend!

Continue with Day 6: Introduction to Catmandu >>