Archive

Archive for May, 2011

Converting strings to lowercase in bash

May 25, 2011 1 comment

I needed to convert the values of a variable to lowercase for comparison purposes. A quick search on Google yielded some very interesting results from stackoverflow

I settled on the tr example.

string="A FEW WORDS"
$ echo $string | tr '[:upper:]' '[:lower:]'
a few words

But this just blew me away. Pity I do not have Bash 4. And no. I can’t install it on the server.

In Bash 4:

To lowercase

$ string="A FEW WORDS"
$ echo ${string,,}
a few words

$ echo ${string,}
a FEW WORDS

$ echo ${string,,[AEIUO]}
a FeW WoRDS

To uppercase

$ string="a few words"

$ echo ${string^}
A few words

$ echo ${string^^}
A FEW WORDS

$ echo ${string ^^[aeiou]}
A fEw wOrds

Toggle (undocumented)

$ string="A Few Words"

$ echo ${string~~}
a fEW wORDS

$ string="A FEW WORDS"

$ echo ${string~}
a fEW wORDS

$ string="a few words"

$ echo ${string~}
A Few Words

Title case:

$ string="a few words"
$ string=($string)
$ string=${string[@]^}
$ echo $string
A Few Words

Please visit the stackoverflow link provided above for more examples.

Advertisements
Categories: bash Tags: , , ,

Automatically update spreadsheets and graphs – Part 2

May 20, 2011 Leave a comment

Finally got some time to work on this. So I modified the initial code I posted here. This “newer” version is able to work with the spreadsheet and worksheet name that you pass on to it.

The data I have in the csv file (mydata.csv) looks like this

Date, Toy Cars sold
2011-01-01, 91
2011-01-02, 92
2011-01-03, 93
2011-01-04, 94
2011-01-05, 95

Usage:

shell>python an_example.py mydata.csv
or
shell>python an_example.py --user [username] --pw [password] --sname [spreadsheet name] --wname [worksheet name] mydata.csv
or
shell>python an_example.py --user [username] --pw [password] --cname [category] --sname [spreadsheet name] --wname [worksheet name] mydata.csv

If your email address is not in this format “user@gmail.com” then use the full email address for your google apps account “user@googleappsaccount.com”.

#!/usr/bin/python

import csv
import getpass
import gdata.docs.service
import gdata.spreadsheet.service
import getopt
import sys
import urllib

def get_data(csvfile):
    """Read data from csv file.

    Read data from the given csv file and return it as a list.

    Keyword arguments:
    csvfile -- The csv file that has the data to be written to the worksheet file.

    """
    reader = csv.reader(open(csvfile, "r"))

    data = []
    for row in reader:
      data.append(list(row))

    return data
        
def get_items(feed):
    """ Get the items in the feed.
    
    Either a list of documents that the user has or a list of worksheets within
    a given spreadsheet.

    Keyword arguments:
    feed -- The feed (xml file).
    """
    items = {}
    for entry in feed.entry:
        id_parts = urllib.unquote(entry.id.text).replace(':','/').split('/')
        key = id_parts[len(id_parts) - 1]
        items[entry.title.text.lower()] = key
    return items

def usage():
    """ How this works """ 
    usage_text = """python an_example.py --user [username] --pw [password] 
                --cname [category] --sname [spreadsheet name] 
                --wname [worksheet name] csv"""
    print usage_text

# Parse command line options
try:
    opts, args = getopt.getopt(sys.argv[1:], '', ['user=', 'pw=', 'cname=', 'sname=', 'wname='])
except getopt.error, msg:
    usage()
    sys.exit(2)

user = ''
pw = ''
cname = ''
sname = ''
wname = ''

# Process options
for option, arg in opts:
    if option == '--user':
      user = arg
    elif option == '--pw':
      pw = arg
    elif option == '--cname':
      cname = arg
    elif option == '--sname':
      sname = arg
    elif option == '--wname':
      wname = arg

# Get the csv file
try:
    data, = args
    sheet_data = get_data(data)
except ValueError:
    print "You must specify a csv file"
    usage()
    sys.exit(2)
except IOError:
    print "You must specify a csv file"
    usage()
    sys.exit(2)

while not user:
    print 'NOTE: Please run these tests only with a test account.'
    user = raw_input('Please enter your username: ')

while not pw:
    pw = getpass.getpass()
    if not pw:
        print 'Password cannot be blank.'

while not sname:
    sname = raw_input('Please enter a spreadsheet name: ')
    if not sname:
        print 'Spreadsheet name cannot be blank.'
while not wname:
    wname = raw_input('Please enter a worksheet name: ')
    if not wname:
        print 'Worksheet name cannot be blank.'

try:
    category = cname
    email = user
    password = pw
    source = category 
    spreadsheet_name = sname.lower()
    worksheet_name = wname.lower()
except NameError:
    cname = ''

gd_client = gdata.docs.service.DocsService()
gd_client.ClientLogin(email, password, source=source)

gs_client = gdata.spreadsheet.service.SpreadsheetsService()
gs_client.email = email
gs_client.password = password
gs_client.ProgrammaticLogin()

try:
    # Get the SpreadSheet.
    if not category:
        feed = gd_client.GetDocumentListFeed()
    else:
        query = gdata.docs.service.DocumentQuery(categories=[category])
        feed = gd_client.Query(query.ToUri())

    docs = get_items(feed)

    #print docs # Comment this out to see the spreadsheets a user has.

    curr_key = docs[spreadsheet_name]

    # Get the WorkSheet within the SpreadSheet.
    feed = gs_client.GetWorksheetsFeed(curr_key)

    sheets = get_items(feed)
    #print sheets # Comment this out to see the worksheets a given spreadsheet has.

    curr_wksht_id = sheets[worksheet_name]

    # Write the data in the csv file to the given worksheet.
    rowx = 1
    for row in sheet_data:
        rowx += 1
        for colx, value in enumerate(row):
            colx += 1
            gs_client.UpdateCell(rowx, colx, value, curr_key, curr_wksht_id)

except KeyError:
    print "Sorry, the spreadsheet or worksheet name you specified does not exist."


Source:
All the code is from the sample files that come with the API. Mostly from these two files /gdata-2.0.14/samples/docs/docs_example.py and /gdata-2.0.14/samples/spreadsheets/spreadsheetExample.py

Errors:
1. If you give the wrong spreadsheet key.

  • gdata.service.RequestError: {‘status’: 400, ‘body’: ‘The spreadsheet at this URL could not be found. Make sure that you have the right URL and that the owner of the spreadsheet hasn't deleted it.’, ‘reason’: ‘Bad Request’}

2. Attempting to write data to the wrong location.
In my case I was attempting to write to a cell at position row 1, col 0 instead of row 1, col 1. The error clearly stats R1C0.

  • gdata.service.RequestError: {‘status’: 400, ‘body’: ‘Invalid query parameter value for R1C0.’, ‘reason’: ‘Bad Request’}

3. Using wrong email address.
I was using the wrong email address. So I specified the full email address otherwise it assumes that your address is username@gmail.com.

  • gdata.service.Error: Server responded with a 403 code
Categories: python Tags: , ,

Aspell – Using more than one language

May 11, 2011 4 comments

Problem: Need to spell check a word file and return mispelt words. Issue is that there are “false positives” as the words are for a different language, eg say Swahili and English words.

Solution: Aspell

Make sure you have aspell. The default dictionary mine had was english. Then install the dictionary for the other language you want to have. The dictionaries are here ftp://ftp.gnu.org/gnu/aspell/dict/0index.html. Download and follow install directions.

I got some sample text on Zanzibar from http://en.wikipedia.org/wiki/Zanzibar and saved it in a file called Zanzibar.txt.

shell> less Zanzibar.txt
Wildlife
The main island of Zanzibar, Unguja, has a fauna which reflects its connection to the African mainland during the last Ice Age. Endemic mammals with continental relatives include the Zanzibar red colobus, one of Africa's rarest primates, the Zanzibar red colobus may number only about 1500. Isolated on this island for at least 1,000 years, the Zanzibar red colobus (Procolobus kirkii) is recognized as a distinct species, with different coat patterns, calls and food habits than related colobus species on the mainland

Doing a spell check with aspell and the english dictionary

shell> cat Zanzibar.txt | aspell --lang=en list | sort -u
birdlife
colobus
genet
Jozani
kirkii
Pemba
Procolobus
servaline
Unguja

Do a spell check with the english and swahili dictionary. (I just pipe the results to aspell again and check the words using the swahili dictionary).

shell> cat Zanzibar.txt | aspell --lang=en list | aspell --lang=sw list | sort -u
birdlife
colobus
genet
Jozani
kirkii
Procolobus
servaline

If there are words that you would like to always ignore. You can put the in a file and have aspell ignore them.*

shell>less .aspell.sw.pws
personal_ws-1.1 sw
jozani

Note that the case of the word does not matter.

shell> cat Zanzibar.txt | aspell --lang=en list | aspell --lang=sw list --personal=/home/rodnee/.aspell.sw.pws     --dont-suggest | sort -u
birdlife
colobus
genet
kirkii
Procolobus
servaline

To ignore english words you can do the same, just save the file as en and set the language as en.*

shell>less .aspell.en.pws
personal_ws-1.1 en
birdlife

*You can put as many words as you like.

Killing MySQL queries

May 10, 2011 Leave a comment

Problem: Opps query on a table is taking too long. Due to
a) Lack of index
b) Forgot to put a limit.
The table in question has over 20Million rows.

Solution:
1. Connect to mysql as your given users eg guest
mysql> show process list;
+———–+———+——————+——–+———+——-+——-+——————+
| Id | User | Host | db | Command | Time | State | Info |
+———–+———+——————+——–+———+——-+——-+——————+
| 120 | guest | localhost:57031 | database1 | Query | 0 | NULL | show processlist |
| 156 | guest | localhost:52753 |database1 | Query | 40 | Copying to tmp table | SELECT * FROM table1 INNER JOI |
| 192 | guest | localhost:54919 | database1 | Sleep | 21491 | | NULL |
+———–+———+——————+——–+———+——-+——-+——————+

2. Identify the query you would like to stop.
mysql>kill 156;

3. Check to see that the process has gone away.
mysql> show processlist;
+———–+———+——————+——–+———+——-+——-+——————+
| Id | User | Host | db | Command | Time | State | Info |
+———–+———+——————+——–+———+——-+——-+——————+
| 120 | guest | localhost:57031 | database1 | Query | 0 | NULL | show processlist |
| 192 | guest | localhost:54919 | database1 | Sleep | 21491 | | NULL |
+———–+———+——————+——–+———+——-+——-+——————+

The documentation for the KILL command is here http://dev.mysql.com/doc/refman/5.0/en/kill.html.

Automatically update spreadsheets and graphs – Part 1

May 9, 2011 5 comments

Problem: Automatically update a spreadsheet and its corresponding graphs.

Solutions:
I had a number of solutions or so I thought.
1. Use python and xlwt. xlwt does not work with graphs. Atleast not at the moment.
2. Use python uno. The server running the script would need to have a local version of open office.
3. Use googledocs and python. (Final code here)

So far the 3rd solution is looking promising. I have a sample script and a sample the google doc that I was to edit.

#!/usr/bin/python

import gdata.spreadsheet.service

def _CellsUpdateAction(row,col,inputValue,key,wksht_id):
    entry = spr_client.UpdateCell(row=row, col=col, inputValue=inputValue,
            key=curr_key, wksht_id=curr_wksht_id)
    if isinstance(entry, gdata.spreadsheet.SpreadsheetsCell):
        print 'Updated!'

def _PrintFeed(feed):
    for i, entry in enumerate(feed.entry):
        if isinstance(feed, gdata.spreadsheet.SpreadsheetsCellsFeed):
            print '%s %s\n' % (entry.title.text, entry.content.text)
        elif isinstance(feed, gdata.spreadsheet.SpreadsheetsListFeed):
            print '%s %s %s' % (i, entry.title.text, entry.content.text)
            # Print this row's value for each column (the custom dictionary is
            # built using the gsx: elements in the entry.)
            print 'Contents:'
            for key in entry.custom:  
              print '  %s: %s' % (key, entry.custom[key].text) 
            print '\n',
        else:
            print '%s %s\n' % (i, entry.title.text)


email = 'user@gmail.com'
password = 'password'

spr_client = gdata.spreadsheet.service.SpreadsheetsService()
spr_client.email = email
spr_client.password = password
spr_client.ProgrammaticLogin()

feed = spr_client.GetSpreadsheetsFeed()
# List the spreadsheets that available.
# Totally unneccessary but just here so that you can see what index to give to input variable below.
_PrintFeed(feed) 
# Get the first one. (Otherwise pick the desired index)
input = 0 
id_parts = feed.entry[0].id.text.split('/')
curr_key = id_parts[len(id_parts) - 1]
print curr_key

feed = spr_client.GetWorksheetsFeed(curr_key)
id_parts = feed.entry[0].id.text.split('/')
curr_wksht_id = id_parts[len(id_parts) - 1]
print curr_wksht_id

# I just did it this way to test to see if it works.
date1 = "2011-01-01"  # Overwrites the existing dates.
date2 = "2011-01-02"  # Overwrites the existing dates.
date3 = "2011-01-03"  # Overwrites the existing dates.
weight1 = "30"
weight2 = "60"
weight3 = "90"
weight4 = "80"

_CellsUpdateAction(2, 2, "Date", curr_key, curr_wksht_id)
_CellsUpdateAction(3, 2, date1, curr_key, curr_wksht_id)
_CellsUpdateAction(4, 2, date2, curr_key, curr_wksht_id)
_CellsUpdateAction(5, 2, date3, curr_key, curr_wksht_id)
_CellsUpdateAction(2, 3, "Weight", curr_key, curr_wksht_id)
_CellsUpdateAction(3, 3, weight1, curr_key, curr_wksht_id)
_CellsUpdateAction(4, 3, weight2, curr_key, curr_wksht_id)
_CellsUpdateAction(5, 3, weight3, curr_key, curr_wksht_id)
_CellsUpdateAction(6, 3, weight4, curr_key, curr_wksht_id)

 

Some of the errors I got trying to get the above script working include

gdata.service.RequestError: {‘status’: 400, ‘body’: ‘We're sorry, a server error occurred. Please wait a bit and try reloading your spreadsheet.’, ‘reason’: ‘Bad Request’}

gdata.service.RequestError: {‘status’: 400, ‘body’: ‘Invalid query parameter value for R0C0.’, ‘reason’: ‘Bad Request’}

Before running code.
Graph before running the code.

And this is the graph after running the code.
Graph after running code.

Conclusion:
My main objective was updating a spreadsheet and having the graph change as well. So far python + googledocs works very nicely for me. I will upload more code once I get a chance to work on this some more.

Source:
Most of the code came from the spreadsheet.py example in the Google Spreadsheets Data API.

Instructions on how to install Google Spreadsheets Data API for python is available here. The instructions worked for me except that I used sudo python setup.py install to install the package.

Part 2

Categories: python Tags: , ,