Python interpreter tab completion on OS X

This is mainly for my own reference, but other people will probably find it useful. If you try to use the rlcompleter module with the Python that ships with OS X, you'll find that tab completion does not work. This is because Apple builds the readline module against libedit (the BSD editline library) rather than GNU readline, and libedit uses a different syntax for key bindings.

To get completion working you have to use a different argument to readline.parse_and_bind(). The base example uses:

readline.parse_and_bind("tab: complete")

Change that line to:

readline.parse_and_bind("bind ^I rl_complete")
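If the same startup file is shared between a Mac and other machines, one option is to pick the binding at runtime. The sketch below assumes the common convention that a libedit-backed readline mentions "libedit" in its module docstring; the helper name completion_binding is made up for illustration.

```python
try:
    import readline
    import rlcompleter  # importing this registers the default completer
except ImportError:
    readline = None  # e.g. a platform without the readline module


def completion_binding(readline_doc):
    """Return the parse_and_bind() argument suited to the readline backend."""
    if readline_doc and "libedit" in readline_doc:
        # libedit (Apple's system Python) uses its own bind syntax
        return "bind ^I rl_complete"
    # GNU readline syntax, as in the base example
    return "tab: complete"


if readline is not None:
    readline.parse_and_bind(completion_binding(readline.__doc__))
```

Saved as the file named by the PYTHONSTARTUP environment variable, this should give tab completion in the interactive interpreter on either backend.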

Tagged with mac, osx and python.

Scrapy dependency problems with lxml

Following the recent PhillyPUG meetup I was trying to install scrapy on an old MacBook Pro running OS X 10.6 (Snow Leopard) and ran into a number of problems with the lxml dependency. This is the parser used to extract data from pages that scrapy downloads so you are not going to get very far without it.

It seems that the compilation problems experienced when installing with pip result from an attempt to build a universal binary that includes the PowerPC (ppc) architecture. If you have Xcode 4 installed then you lose the ppc assembler, as the error below shows, and need to make sure that only the architectures you can still build are specified.

Architecture Fix

You can set the architecture in your bash profile. Alternatively, exporting it in a new bash session, as below, ensures that the build script picks it up.

sudo bash
export ARCHFLAGS='-arch i386 -arch x86_64'
pip install lxml # test it
pip install scrapy --upgrade # fix the failed scrapy install
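To see which architectures a build will request, you can pull the -arch values out of a flag string. This is a rough sketch: archs_in_flags is a hypothetical helper, and the sysconfig module used in the demo is only available from Python 2.7 onwards (on 2.6 the equivalent lives in distutils.sysconfig).

```python
import shlex


def archs_in_flags(flags):
    """Return the architecture names that follow each -arch option."""
    tokens = shlex.split(flags)
    return [tokens[i + 1] for i, tok in enumerate(tokens[:-1]) if tok == "-arch"]


if __name__ == "__main__":
    import os
    import sysconfig

    # Flags Python itself was built with; extension builds reuse these by default
    print("Python build flags:", archs_in_flags(sysconfig.get_config_var("CFLAGS") or ""))
    # The override exported in the shell session, if any
    print("ARCHFLAGS override:", archs_in_flags(os.environ.get("ARCHFLAGS", "")))
```

If ppc shows up in the first list and not in the second, the override is doing its job.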

Original Error

brianly$ sudo pip install lxml --upgrade
Downloading/unpacking lxml
Downloading lxml-2.3.tar.gz (3.2Mb): 3.2Mb downloaded
Running setup.py egg_info for package lxml
  Building lxml version 2.3.
  Building without Cython.
  Using build configuration of libxslt 1.1.24
  warning: no previously-included files found matching '*.py'
Installing collected packages: lxml
Found existing installation: lxml 2.2.2
  Uninstalling lxml:
    Successfully uninstalled lxml
Running setup.py install for lxml
  Building lxml version 2.3.
  Building without Cython.
  Using build configuration of libxslt 1.1.24
  building 'lxml.etree' extension
  gcc-4.2 -fno-strict-aliasing -fno-common -dynamic -DNDEBUG -g -fwrapv -Os -Wall -Wstrict-prototypes -DENABLE_DTRACE -arch i386 -arch ppc -arch x86_64 -pipe -I/usr/include/libxml2 -I/System/Library/Frameworks/Python.framework/Versions/2.6/include/python2.6 -c src/lxml/lxml.etree.c -o build/temp.macosx-10.6-universal-2.6/src/lxml/lxml.etree.o -w -flat_namespace
  /usr/libexec/gcc/powerpc-apple-darwin10/4.2.1/as: assembler (/usr/bin/../libexec/gcc/darwin/ppc/as or /usr/bin/../local/libexec/gcc/darwin/ppc/as) for architecture ppc not installed
  Installed assemblers are:
  /usr/bin/../libexec/gcc/darwin/x86_64/as for architecture x86_64
  /usr/bin/../libexec/gcc/darwin/i386/as for architecture i386
  src/lxml/lxml.etree.c:161594: fatal error: error writing to -: Broken pipe
  compilation terminated.
  lipo: can't open input file: /var/tmp//ccYr9GpX.out (No such file or directory)
  error: command 'gcc-4.2' failed with exit status 1
  Complete output from command /usr/bin/python -c "import setuptools;__file__='/Users/brianly/dev/github/pyconscrape/build/lxml/setup.py';exec(compile(open(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --single-version-externally-managed --record /tmp/pip-axeEA7-record/install-record.txt:
  Building lxml version 2.3.
  Building without Cython.
  Using build configuration of libxslt 1.1.24
  running install
  running build
  running build_py
  creating build
  creating build/lib.macosx-10.6-universal-2.6
  creating build/lib.macosx-10.6-universal-2.6/lxml
  copying src/lxml/__init__.py -> build/lib.macosx-10.6-universal-2.6/lxml
  copying src/lxml/_elementpath.py -> build/lib.macosx-10.6-universal-2.6/lxml
  copying src/lxml/builder.py -> build/lib.macosx-10.6-universal-2.6/lxml
  copying src/lxml/cssselect.py -> build/lib.macosx-10.6-universal-2.6/lxml
  copying src/lxml/doctestcompare.py -> build/lib.macosx-10.6-universal-2.6/lxml
  copying src/lxml/ElementInclude.py -> build/lib.macosx-10.6-universal-2.6/lxml
  copying src/lxml/pyclasslookup.py -> build/lib.macosx-10.6-universal-2.6/lxml
  copying src/lxml/sax.py -> build/lib.macosx-10.6-universal-2.6/lxml
  copying src/lxml/usedoctest.py -> build/lib.macosx-10.6-universal-2.6/lxml
  creating build/lib.macosx-10.6-universal-2.6/lxml/html
  copying src/lxml/html/__init__.py -> build/lib.macosx-10.6-universal-2.6/lxml/html
  copying src/lxml/html/_dictmixin.py -> build/lib.macosx-10.6-universal-2.6/lxml/html
  copying src/lxml/html/_diffcommand.py -> build/lib.macosx-10.6-universal-2.6/lxml/html
  copying src/lxml/html/_html5builder.py -> build/lib.macosx-10.6-universal-2.6/lxml/html
  copying src/lxml/html/_setmixin.py -> build/lib.macosx-10.6-universal-2.6/lxml/html
  copying src/lxml/html/builder.py -> build/lib.macosx-10.6-universal-2.6/lxml/html
  copying src/lxml/html/clean.py -> build/lib.macosx-10.6-universal-2.6/lxml/html
  copying src/lxml/html/defs.py -> build/lib.macosx-10.6-universal-2.6/lxml/html
  copying src/lxml/html/diff.py -> build/lib.macosx-10.6-universal-2.6/lxml/html
  copying src/lxml/html/ElementSoup.py -> build/lib.macosx-10.6-universal-2.6/lxml/html
  copying src/lxml/html/formfill.py -> build/lib.macosx-10.6-universal-2.6/lxml/html
  copying src/lxml/html/html5parser.py -> build/lib.macosx-10.6-universal-2.6/lxml/html
  copying src/lxml/html/soupparser.py -> build/lib.macosx-10.6-universal-2.6/lxml/html
  copying src/lxml/html/usedoctest.py -> build/lib.macosx-10.6-universal-2.6/lxml/html
  creating build/lib.macosx-10.6-universal-2.6/lxml/isoschematron
  copying src/lxml/isoschematron/__init__.py -> build/lib.macosx-10.6-universal-2.6/lxml/isoschematron
  creating build/lib.macosx-10.6-universal-2.6/lxml/isoschematron/resources
  creating build/lib.macosx-10.6-universal-2.6/lxml/isoschematron/resources/rng
  copying src/lxml/isoschematron/resources/rng/iso-schematron.rng -> build/lib.macosx-10.6-universal-2.6/lxml/isoschematron/resources/rng
  creating build/lib.macosx-10.6-universal-2.6/lxml/isoschematron/resources/xsl
  copying src/lxml/isoschematron/resources/xsl/RNG2Schtrn.xsl -> build/lib.macosx-10.6-universal-2.6/lxml/isoschematron/resources/xsl
  copying src/lxml/isoschematron/resources/xsl/XSD2Schtrn.xsl -> build/lib.macosx-10.6-universal-2.6/lxml/isoschematron/resources/xsl
  creating build/lib.macosx-10.6-universal-2.6/lxml/isoschematron/resources/xsl/iso-schematron-xslt1
  copying src/lxml/isoschematron/resources/xsl/iso-schematron-xslt1/iso_abstract_expand.xsl -> build/lib.macosx-10.6-universal-2.6/lxml/isoschematron/resources/xsl/iso-schematron-xslt1
  copying src/lxml/isoschematron/resources/xsl/iso-schematron-xslt1/iso_dsdl_include.xsl -> build/lib.macosx-10.6-universal-2.6/lxml/isoschematron/resources/xsl/iso-schematron-xslt1
  copying src/lxml/isoschematron/resources/xsl/iso-schematron-xslt1/iso_schematron_message.xsl -> build/lib.macosx-10.6-universal-2.6/lxml/isoschematron/resources/xsl/iso-schematron-xslt1
  copying src/lxml/isoschematron/resources/xsl/iso-schematron-xslt1/iso_schematron_skeleton_for_xslt1.xsl -> build/lib.macosx-10.6-universal-2.6/lxml/isoschematron/resources/xsl/iso-schematron-xslt1
  copying src/lxml/isoschematron/resources/xsl/iso-schematron-xslt1/iso_svrl_for_xslt1.xsl -> build/lib.macosx-10.6-universal-2.6/lxml/isoschematron/resources/xsl/iso-schematron-xslt1
  copying src/lxml/isoschematron/resources/xsl/iso-schematron-xslt1/readme.txt -> build/lib.macosx-10.6-universal-2.6/lxml/isoschematron/resources/xsl/iso-schematron-xslt1
  running build_ext
  building 'lxml.etree' extension
  creating build/temp.macosx-10.6-universal-2.6
  creating build/temp.macosx-10.6-universal-2.6/src
  creating build/temp.macosx-10.6-universal-2.6/src/lxml
  gcc-4.2 -fno-strict-aliasing -fno-common -dynamic -DNDEBUG -g -fwrapv -Os -Wall -Wstrict-prototypes -DENABLE_DTRACE -arch i386 -arch ppc -arch x86_64 -pipe -I/usr/include/libxml2 -I/System/Library/Frameworks/Python.framework/Versions/2.6/include/python2.6 -c src/lxml/lxml.etree.c -o build/temp.macosx-10.6-universal-2.6/src/lxml/lxml.etree.o -w -flat_namespace
  /usr/libexec/gcc/powerpc-apple-darwin10/4.2.1/as: assembler (/usr/bin/../libexec/gcc/darwin/ppc/as or /usr/bin/../local/libexec/gcc/darwin/ppc/as) for architecture ppc not installed
  Installed assemblers are:
  /usr/bin/../libexec/gcc/darwin/x86_64/as for architecture x86_64
  /usr/bin/../libexec/gcc/darwin/i386/as for architecture i386
  src/lxml/lxml.etree.c:161594: fatal error: error writing to -: Broken pipe
  compilation terminated.
  lipo: can't open input file: /var/tmp//ccYr9GpX.out (No such file or directory)
  error: command 'gcc-4.2' failed with exit status 1

----------------------------------------
  Rolling back uninstall of lxml
Exception:
Traceback (most recent call last):
  File "/Library/Python/2.6/site-packages/pip-1.0.1-py2.6.egg/pip/basecommand.py", line 126, in main
    self.run(options, args)
  File "/Library/Python/2.6/site-packages/pip-1.0.1-py2.6.egg/pip/commands/install.py", line 228, in run
    requirement_set.install(install_options, global_options)
  File "/Library/Python/2.6/site-packages/pip-1.0.1-py2.6.egg/pip/req.py", line 1104, in install
    requirement.rollback_uninstall()
  File "/Library/Python/2.6/site-packages/pip-1.0.1-py2.6.egg/pip/req.py", line 487, in rollback_uninstall
    self.uninstalled.rollback()
  File "/Library/Python/2.6/site-packages/pip-1.0.1-py2.6.egg/pip/req.py", line 1417, in rollback
    pth.rollback()
AttributeError: 'str' object has no attribute 'rollback'

Storing complete log in /Users/brianly/.pip/pip.log

Tagged with lxml, python, scrapy and scripting.

Installing Python Imaging on Windows 64-bit (AMD64) Python

If you need a 64-bit version of this module for Google App Engine, or another project, you can run into a couple of issues. Sticking to 32-bit versions of Python and PIL avoids these issues, so that may be the best direction for newbies.

The downloads on the PIL site are for 32-bit versions of Python, which means that you hit runtime issues as soon as the Python script files attempt to load the 32-bit libraries into 64-bit Python. Annoyingly, the installer is unaware of the architecture of the Python installation it finds and will leave you with a broken PIL install. If that happens, just uninstall it from Control Panel. My first thought was to try a build from source, but I figured someone else must have run into this problem.

After some searching I found a site which offers pre-built versions of many Python modules for 64-bit architectures, including PIL. If you are in any way concerned about performance or security, a better solution might be to build from source or to use builds from a company offering supported versions.

I downloaded the version for Python 2.5 and then discovered that it would not install. What is interesting is that the 32-bit installer was able to find my 64-bit Python, but the 64-bit installer was unable to find it.

After some more searching it turns out that the installer takes its cue from a key in the registry, and registry redirection in Windows x64 (the Wow6432Node view kept for 32-bit applications) was confusing it. To resolve the issue I had to export the “HKLM\SOFTWARE\Wow6432Node\Python\PythonCore\2.5” key, remove the “Wow6432Node\” string using a text editor, and re-import the key so that it was at “HKLM\SOFTWARE\Python\PythonCore\2.5”.

After doing this the 64-bit installer for PIL detected my Python installation and I was up and running.
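The text-editor step can also be scripted. Here is a minimal sketch that rewrites the key paths in the text of an exported .reg file; strip_wow6432node is a made-up helper name, and reading and writing the file itself (regedit exports are typically UTF-16) is left out.

```python
def strip_wow6432node(reg_text):
    """Drop the Wow6432Node path component from exported registry key paths."""
    return reg_text.replace(r"\Wow6432Node", "")


# A key header line as it would appear in the exported .reg file
exported = r"[HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Python\PythonCore\2.5]"
print(strip_wow6432node(exported))
# prints [HKEY_LOCAL_MACHINE\SOFTWARE\Python\PythonCore\2.5]
```

Keep a copy of the original export so that you can undo the change if the re-import does not behave as expected.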

Tagged with imaging, pil, python and windows.

Looking forward to PyCon 2010!

On Thursday I'm making the yearly pilgrimage to PyCon in Atlanta. This will be my third year and I'm sure it'll be better than ever.

For me the real highlights of this conference are the legendary open space sessions. The level of interaction and learning at these really sets it apart from other conferences. Last year I attended a number of crackers including one on Cassandra and big data scalability with Jonathan Ellis. It's great to see that Jonathan is delivering a scheduled talk on database scalability, and I'm sure there will be a fair number of open space sessions dedicated to NoSQL databases.

Looking at the 2010 schedule these are my other top picks:

  1. Deployment, development, packaging, and a little bit of the cloud (Ian Bicking)
  2. Powerful Pythonic Patterns (Alex Martelli)
  3. Understanding the Python GIL (David Beazley)
  4. Mastering Team Play: Four powerful examples of composing Python tools (Raymond Hettinger)
  5. Unladen Swallow: fewer coconuts, faster Python (Collin Winter)

I'll have my camera with me again this year so watch out for pics under the pycon and pycon2010 tags on Flickr.

Tagged with atlanta, pycon and python.

Twitter to blog script

Based on an example provided with the Twitter library for Python, I cobbled together the following script to add my latest tweets to this site. It's called from a cron job that I run on an occasional basis. My script linkifies hashtags and @username tokens in tweets so that you can see search results or user information.

Why did I not use one of the WordPress widgets? Well, writing scripts like this is fun, and some widgets don't seem to play too well with my Thesis theme. One thing to note is that getting the shell script set up under some cron configurations can take a while if you don't use cron on a regular basis. Its operation also differs between Ubuntu Server and Joyent's Accelerator platform.

Up next: a similar script to process my latest bookmarks on Delicious.com.

Main script (tweets.py)

#!/usr/bin/python

import codecs, re, getopt, sys, twitter

TEMPLATE = """
<li>
  <span class="twitter-text">%s</span>
  <span class="twitter-relative-created-at"><a href="http://twitter.com/%s/statuses/%s">Posted %s</a></span>
</li>
"""

def Usage():
  print 'Usage: %s [options] twitterid' % __file__
  print
  print '  This script fetches a users latest twitter update and stores'
  print '  the result in a file as an XHTML fragment'
  print
  print '  Options:'
  print '    --help -h : print this help'
  print '    --output : the output file [default: stdout]'


def FetchTwitter(user, output):
  assert user
  statuses = twitter.Api().GetUserTimeline(user=user, count=7)

  xhtml = []
  for status in statuses:
    status.text = Linkify(status.text)
    xhtml.append(TEMPLATE % (status.text, status.user.screen_name, status.id, status.relative_created_at))

  if output:
    Save(''.join(xhtml), output)
  else:
    print ''.join(xhtml)

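def Linkify(tweet):
  # Wrap @username and #hashtag tokens in links. The link targets are
  # assumptions: a profile URL matching TEMPLATE and a Twitter search URL.
  tweet = re.sub(r'(\A|\s)@(\w+)', r'\1<a href="http://twitter.com/\2">@\2</a>', tweet)
  return re.sub(r'(\A|\s)#(\w+)', r'\1<a href="http://twitter.com/search?q=%23\2">#\2</a>', tweet)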

def Save(xhtml, output):
  out = codecs.open(output, mode='w', encoding='utf-8',
                    errors='xmlcharrefreplace')
  out.write(xhtml)
  out.close()

def main():
  try:
    # 'o:' (with the colon) tells getopt that -o takes an argument
    opts, args = getopt.gnu_getopt(sys.argv[1:], 'ho:', ['help', 'output='])
  except getopt.GetoptError:
    Usage()
    sys.exit(2)
  try:
    user = args[0]
  except IndexError:
    # No twitterid was given on the command line
    Usage()
    sys.exit(2)
  output = None
  for o, a in opts:
    if o in ("-h", "--help"):
      Usage()
      sys.exit(2)
    if o in ("-o", "--output"):
      output = a
  FetchTwitter(user, output)

if __name__ == "__main__":
  main()
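The two regular expressions can be tried out without touching the Twitter API. The sketch below is a standalone illustration rather than part of tweets.py: the link targets are assumed URLs, and linkify is a lower-case stand-in for the function in the script.

```python
import re


def linkify(tweet):
    """Link @username and #hashtag tokens that start the text or follow whitespace."""
    # Assumed profile URL pattern
    tweet = re.sub(r'(\A|\s)@(\w+)',
                   r'\1<a href="http://twitter.com/\2">@\2</a>', tweet)
    # Assumed search URL pattern; %23 is the URL-encoded '#'
    return re.sub(r'(\A|\s)#(\w+)',
                  r'\1<a href="http://twitter.com/search?q=%23\2">#\2</a>', tweet)


sample = "@brianly see #pycon, mail me@example.com"
print(linkify(sample))
```

Because each pattern requires the start of the text or whitespace before the @ or #, the address me@example.com is left alone while @brianly and #pycon become links.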

Shell script executed as cron job (tweets.sh)

/usr/bin/python /path/to/tweets.py brianly --output /path/to/output/twittertimeline.htm

Tagged with cron, python, scripting and twitter.

PyCon 2008 - IronPython Highlights

IronPython was one of the factors that influenced my decision to attend PyCon. Microsoft are approaching the release of version 2.0, which will have parity with CPython 2.5. The production versions are already close to full Python 2.4 support, making it a viable platform for use in a lot of places where I would typically use C#.

Going into the conference I was looking forward to the Sunday session with Jim Hugunin, but there turned out to be some more treats for the IronPython developer. Feihong Hsu ran a session on Python.NET and how you can bridge from CPython to the .NET platform, taking advantage of rich Windows APIs. Michael Foord spoke on Silverlight as well as his company's spreadsheet which embeds IronPython.

Feihong organised an open space session for Saturday evening after the PyWin32 gathering to talk about Python.NET and we were joined by the IronPython developers and management (Dino Viehland, Harry Pierson, Jim Hugunin and others). We discussed a number of aspects of IronPython and progress towards the 2.0 release. It looks like this may be complete in October given that they released the first beta last week. Again Michael Foord had something interesting to say on what Resolver Systems are doing.

Michael presented an open source project called IronClad. This is quite an insane assortment of code, from C# to Python to assembler, all in the name of accessing Python modules written in C. To date they have the bzip2 module running but are working on support for modules like NumPy which are important to their customers.

After the open space session we headed into Chicago for dinner at India House. This gave us a chance to find out some more stuff about the IronPython implementation, and other factoids. Dino hinted that he was working on getting Django up and running. Little did we know he was going to be demoing this to the crowd on Sunday.

Sunday saw Jim's big talk and I managed to get a few photos. It wasn't easy, but I think these turned out a bit better than earlier shots at the conference. Dino showed off the fairly minimal changes needed to get Django running on IronPython and Jim demoed the IronPython interpreter running under Dynamic Silverlight.

After the keynote, Dino gave me a quick run through of the IronPython and DLR source code. This was very interesting and it gave me a real step up in understanding what goes on under the covers. Thanks Dino!

Tagged with ironpython, pycon, pycon2008 and python.

PyCon 2008 - Day Zero

Today was the Python tutorial day. Given that I haven't spent a lot of time reading or writing Python code I thought it would be a good idea to attend some of these tutorials. Since they kicked off at 9am it was a bit of a challenge making it on time. My Southwest flight from Philly last night arrived late, and then I had a big trip around to the other side of Chicago. If you need to attend a conference near O'Hare, try to fly into that airport. All I wanted to do was to lie in bed for a few more hours :)

When I made my tutorial selections I was hoping to attend a Python for Java Developers session. This would have been useful given my experience with C#, but it seems that I was in the minority and it was cancelled. I switched to the Django session but I think I may have been better off attending the session on performance optimisation.

Registration wasn't too busy today since the main conference crew won't arrive till Friday. I got a PyCon bag and some flyers but the T-shirts weren't ready. Apparently they'll be available tomorrow; I'd hate to miss out on one!

Python 101 Tutorial (Steve Holden)

It turns out that Steve is another British ex-pat living here in the US. It gets weirder in that he lectured at Manchester University for a number of years. Given Steve's position in the community I expected a sharp introduction to Python. It didn't disappoint and I picked up a fair bit. The 'slice' mechanism looks really useful; I wonder whether it can be implemented with any of the new C# features?
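For anyone who hasn't met slicing yet, here is a quick sketch of the sort of thing that caught my eye. The variable names and sample data are just for illustration.

```python
letters = list("abcdefgh")

first_three = letters[:3]     # items 0, 1 and 2
last_two = letters[-2:]       # negative indexes count from the end
every_other = letters[::2]    # the third number is a step
backwards = letters[::-1]     # a negative step walks the list in reverse
middle = letters[2:5]         # start is inclusive, stop is exclusive

# Strings slice the same way as lists
conference = "PyCon 2008"[:5]
```

Each slice returns a new object, so the original list is left untouched.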

Getting Started with Django (Jacob Kaplan-Moss)

I was expecting this session to be a little more exciting. Jacob has some fine ideas about how Python frameworks should be built but his presentation style is not as striking as DHH's. Since this was an introductory session I can't complain too much, but I really want to hear some more about Django deployment and debugging over the next few days.

Internet Programming with Python (Wesley Chun)

At this point I was pretty exhausted. This tutorial seemed to be geared toward newbies to network programming, rather than a best practice session on leveraging Python for internet programming.

Tagged with pycon, pycon2008 and python.