Archives for August, 2011

26
Aug

Remove a File / Directory from a Tarball Without Extracting First

If you have an extra folder or file within a tarball (.tar) file, you can remove it without extracting the entire tarball first. This can be really handy when you have a massive file that you don’t want to spend a lot of time extracting and re-archiving. Open up a terminal shell and follow along with the example below.

We’ll first see what’s in the compressed tarball named sandbox.tar.gz by using the --list option of tar:

thelinuxdaily$ tar --list --file=sandbox.tar.gz
sandbox/
sandbox/delete_me/
sandbox/delete_me/hello.txt
sandbox/hello.txt
sandbox/hello2.txt
sandbox/hello3.txt
sandbox/save_me/
sandbox/save_me/hello.txt

Let’s try to delete the folder called sandbox/delete_me from the compressed tarball:

thelinuxdaily$ tar --delete --file=sandbox.tar.gz sandbox/delete_me
tar: Cannot update compressed archives
tar: Error is not recoverable: exiting now

See what happened? tar can’t modify a compressed archive, so we need to uncompress it first. In this case, it’s a .gz compression type, so we’ll use gunzip (if you were using bz2, you’d use bunzip2):

thelinuxdaily$ gunzip sandbox.tar.gz

Let’s try to delete the folder sandbox/delete_me again:

thelinuxdaily$ tar --delete --file=sandbox.tar sandbox/delete_me

We didn’t get an error message, so that’s good. Let’s see if it’s gone by using --list again:

thelinuxdaily$ tar --list --file=sandbox.tar
sandbox/
sandbox/hello.txt
sandbox/hello2.txt
sandbox/hello3.txt
sandbox/save_me/
sandbox/save_me/hello.txt
thelinuxdaily$ gzip sandbox.tar

Excellent! That’s what we needed to see. The last command, gzip sandbox.tar, compresses the tarball back into sandbox.tar.gz. If you have any other tips, feel free to use the comments below.
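
If you’d rather not juggle gunzip and gzip by hand, the same idea can be sketched in Python with the standard tarfile module. Treat this as a rough sketch under a few assumptions: it’s written for Python 3, the name remove_from_tarball is made up for this example, and it copies the archive entry by entry into a new compressed file instead of editing it in place. It still reads the whole archive, but nothing gets extracted to disk.

```python
import tarfile


def remove_from_tarball(src, dst, unwanted):
    """Copy the tarball src to dst, skipping unwanted and everything below it.

    remove_from_tarball is a hypothetical helper, not part of tar or tarfile.
    """
    unwanted = unwanted.rstrip("/")
    # "r:*" lets tarfile work out the compression (gzip, bz2 or none) itself.
    with tarfile.open(src, "r:*") as old, tarfile.open(dst, "w:gz") as new:
        for member in old:
            name = member.name.rstrip("/")
            if name == unwanted or name.startswith(unwanted + "/"):
                continue  # drop this file or directory entry
            # Only regular files carry data; directories and links do not.
            data = old.extractfile(member) if member.isfile() else None
            new.addfile(member, data)
```

With the example archive from above, calling remove_from_tarball("sandbox.tar.gz", "sandbox_clean.tar.gz", "sandbox/delete_me") should leave the original file untouched and write the trimmed copy next to it.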

25
Aug

Happy 20th Birthday Linux!

It all started with an email to a mailing list about an operating system that wasn’t going to be “big and professional like gnu”. So, happy birthday to Linux as of 25 Aug 91 20:57:08 GMT. Thanks for the good times so far!

Be sure to check out The Linux Foundation’s special 20th anniversary page here:

http://www.linuxfoundation.org/20th/

3
Aug

How To Build a .tex File to .pdf on Linux

Most of the guides I came across were just wildly all over the place, so I figured a short and simple guide that gets right to the point would be useful. Here’s how to set up Fedora 15 to build (compile might be another word) a .tex file into a .pdf file from the command line. It can be applied to other Linux distributions as well. There are GUIs to help with this, but it’s best to start with the basics. So, fire up a terminal and let’s get started.

Step 1: Install the TexLive Package

The first thing we need to do is install the texlive package.

su -c 'yum install -y texlive'

Step 2: Prepare a .tex Document

Next, we’ll create a “hello world” type .tex file. Fire up your favorite text editor and copy/paste the following (I enjoy using vim since I can stay in the command line while building the LaTeX file). When you’re finished, save the file (I’ve saved it as “hello.tex”).

\documentclass{book}

\usepackage{lipsum}

\begin{document}
\chapter{Sample}

\lipsum[1-4]
\end{document}

Step 3: Build the .tex Document Into a PDF File

Finally, we’ll build (compile) the .tex source into a pretty PDF document using the command below:

pdflatex hello.tex

You’ll see something similar to this:

[dhildreth@drh hello_world]$ pdflatex hello.tex
This is pdfTeXk, Version 3.141592-1.40.3 (Web2C 7.5.6)
 %&-line parsing enabled.
entering extended mode
(./hello.tex
LaTeX2e <2005/12/01>
Babel <v3.8h> and hyphenation patterns for english, usenglishmax, dumylang, noh
yphenation, arabic, basque, bulgarian, coptic, welsh, czech, slovak, german, ng
erman, danish, esperanto, spanish, catalan, galician, estonian, farsi, finnish,
 french, greek, monogreek, ancientgreek, croatian, hungarian, interlingua, ibyc
us, indonesian, icelandic, italian, latin, mongolian, dutch, norsk, polish, por
tuguese, pinyin, romanian, russian, slovenian, uppersorbian, serbian, swedish,
turkish, ukenglish, ukrainian, loaded.
(/usr/share/texmf/tex/latex/base/book.cls
Document Class: book 2005/09/16 v1.4f Standard LaTeX document class
(/usr/share/texmf/tex/latex/base/bk10.clo))
(/usr/share/texmf/tex/latex/lipsum/lipsum.sty)
No file hello.aux.
Chapter 1.
[1{/usr/share/texmf/fonts/map/pdftex/updmap/pdftex.map}] [2] (./hello.aux) )</u
sr/share/texmf/fonts/type1/bluesky/cm/cmbx12.pfb></usr/share/texmf/fonts/type1/
bluesky/cm/cmr10.pfb></usr/share/texmf/fonts/type1/bluesky/cm/cmsl10.pfb>
Output written on hello.pdf (2 pages, 22507 bytes).
Transcript written on hello.log.
[dhildreth@drh hello_world]$

Notice the line “Output written on hello.pdf (2 pages, 22507 bytes).” That means it worked.
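
If you ever want a script to check for that line instead of eyeballing it, here’s a small Python sketch. It assumes Python 3 and that the summary line looks like the one in the output above; find_pdf_summary and SUMMARY are names I made up for the example.

```python
import re

# Matches the summary line pdflatex prints on success, for example:
#   Output written on hello.pdf (2 pages, 22507 bytes).
SUMMARY = re.compile(r"Output written on (\S+) \((\d+) pages?, (\d+) bytes\)\.")


def find_pdf_summary(output):
    """Return (pdf_name, pages, size_in_bytes), or None if no summary line is found.

    find_pdf_summary is a hypothetical helper for this example.
    """
    match = SUMMARY.search(output)
    if match is None:
        return None
    return match.group(1), int(match.group(2)), int(match.group(3))


sample = """Chapter 1.
Output written on hello.pdf (2 pages, 22507 bytes).
Transcript written on hello.log.
"""
print(find_pdf_summary(sample))  # prints ('hello.pdf', 2, 22507)
```

A None result would suggest the build stopped before writing a PDF, in which case hello.log is the place to look.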

Step 4: View the New PDF File

Now, all you need to do is view the generated PDF document. Run the following command:

evince hello.pdf

Note, if you’re like me, you constantly flip between the source and the PDF. evince automatically reloads the PDF every time you re-build using pdflatex, so you can leave it open. Simply add “&” to the end of the command above to run evince in the background and keep your terminal free:

evince hello.pdf &

Alternative: Rubber

I’d like to make a special note about the ‘rubber’ utility. It can take a lot of the guesswork out of building LaTeX files, and I strongly recommend using it over running pdftex or pdflatex by hand as shown above. You can install and run the ‘rubber’ utility with the following commands:

su -c 'yum install -y rubber'
rubber --pdf hello.tex
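
If you build from a script, one way to prefer rubber and fall back to pdflatex is sketched below. It assumes Python 3; build_command and its which parameter are hypothetical names for this example, and which is only there so the lookup can be faked when trying it out.

```python
import shutil


def build_command(tex_file, which=shutil.which):
    """Return the command (as a list) to build tex_file into a PDF.

    Prefers rubber when it is installed and falls back to pdflatex.
    build_command and its `which` parameter are hypothetical names.
    """
    if which("rubber"):
        return ["rubber", "--pdf", tex_file]
    if which("pdflatex"):
        return ["pdflatex", tex_file]
    raise RuntimeError("neither rubber nor pdflatex was found on PATH")


# Pretend only pdflatex is installed:
print(build_command("hello.tex", which=lambda name: name == "pdflatex"))
# prints ['pdflatex', 'hello.tex']
```

The returned list can be handed straight to subprocess.run(cmd, check=True) to do the actual build.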

Conclusions

I hope this helps you get started moving quickly in the right direction. Comments and feedback are welcome.

A few difficulties I had when doing this for the first time were:

1
Aug

Installing lxml for Python and Running an Example

The goal of this guide is to install the lxml Python package and run the example cssutils Python script that uses the lxml package. This guide was written while running Ubuntu, but can be applied to any Linux distribution. Open up the terminal and let’s get started.

Step 1: Install easy_install For Python

The easy_install binary is a Python package manager that makes it easy to install packages such as lxml and cssutils. The following steps will get this installed:

sudo apt-get install curl
curl -O http://python-distribute.org/distribute_setup.py
sudo python distribute_setup.py

Step 2: Install Required Linux Packages for lxml

Now, we install the Linux packages required to build the lxml Python package.

sudo apt-get install libxml2-dev libxslt-dev python-dev

Step 3: Install the lxml and cssutils Python Packages

Next, install the lxml and cssutils Python packages using easy_install.

sudo easy_install lxml
sudo easy_install cssutils
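
Before moving on, a quick check that both packages can be found doesn’t hurt. Here’s a minimal sketch, assuming Python 3.4 or newer and that you run it with the same interpreter the packages were installed into; missing_packages is a made-up helper name.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that Python cannot find for import.

    missing_packages is a hypothetical helper for this example.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(["lxml", "cssutils"])
if missing:
    print("Still missing:", ", ".join(missing))
else:
    print("lxml and cssutils are both importable")
```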

Step 4: Prepare lxml Example

The following is a script copy/pasted from http://cssutils.googlecode.com/svn/trunk/examples/style.py. Copy and paste it into your favorite text editor and save it. I named it lxml_test.py.

# -*- coding: utf-8 -*-
"""
example renderer

moves infos from external stylesheet "css" to internal @style attributes
and for debugging also in @title attributes.

adds css as text to html
"""
from pprint import pprint
import codecs
import cssutils
import os
import sys
import webbrowser

# lxml egg may be in a lib dir below this file (not in SVN though)
sys.path.append(os.path.join(os.path.dirname(__file__), 'lib'))
try:
    import pkg_resources
    pkg_resources.require('lxml')
except pkg_resources.DistributionNotFound, e:
    pass

try:
    from lxml import etree
    from lxml.builder import E
    from lxml.cssselect import CSSSelector
except ImportError, e:
    print 'You need lxml for this example:', e
    sys.exit(1)

def log(level, *msg):
    # logging is disabled: the print statement below is only a docstring
    """print '%s- %s' % (level * '\t ',
                      ' '.join((str(m) for m in msg)))"""

def getDocument(html, css=None):
    """
    returns a DOM of html, if css is given it is appended to html/body as
    pre.cssutils
    """
    document = etree.HTML(html)
    if css:
        # prepare document (add css for debugging)
        e = etree.Element('pre', {'class': 'cssutils'})
        e.text = css
        document.find('body').append(e)
    return document

def styleattribute(element):
    "returns css.CSSStyleDeclaration of inline styles, for html: @style"
    cssText = element.get('style')
    if cssText:
        return cssutils.css.CSSStyleDeclaration(cssText=cssText)
    else:
        return None

def getView(document, css, media='all', name=None,
            styleCallback=lambda element: None):
    """
    document
        a DOM document, currently an lxml HTML document
    css
        a CSS StyleSheet string
    media: optional
        TODO: view for which media it should be
    name: optional
        TODO: names of sheets only
    styleCallback: optional
        should return css.CSSStyleDeclaration of inline styles, for html
        a style declaration for ``element@style``. Gets one parameter
        ``element`` which is the relevant DOMElement

    returns style view
        a dict of {DOMElement: css.CSSStyleDeclaration} for html
    """
    sheet = cssutils.parseString(css)

    view = {}
    specificities = {} # needed temporarily 

    # TODO: filter rules simpler?, add @media
    rules = (rule for rule in sheet if rule.type == rule.STYLE_RULE)
    for rule in rules:
        for selector in rule.selectorList:
            log(0, 'SELECTOR', selector.selectorText)
            # TODO: make this a callback to be able to use other stuff than lxml
            cssselector = CSSSelector(selector.selectorText)
            matching = cssselector.evaluate(document)
            for element in matching:
                #if element.tag in ('div',):
                    # add styles for all matching DOM elements
                    log(1, 'ELEMENT', id(element), element.text)

                    if element not in view:
                        # add initial empty style declaration
                        view[element] = cssutils.css.CSSStyleDeclaration()
                        specificities[element] = {}                    

                        # and add inline @style if present
                        inlinestyle = styleCallback(element)
                        if inlinestyle:
                            for p in inlinestyle:
                                # set inline style specificity
                                view[element].setProperty(p)
                                specificities[element][p.name] = (1,0,0,0)

                    for p in rule.style:
                        # update style declaration
                        if p not in view[element]:
                            # setProperty needs a new Property object and
                            # MUST NOT reuse the existing Property
                            # which would be the same for all elements!
                            # see Issue #23
                            view[element].setProperty(p.name, p.value, p.priority)
                            specificities[element][p.name] = selector.specificity
                            log(2, view[element].getProperty('color'))

                        else:
                            log(2, view[element].getProperty('color'))
                            sameprio = (p.priority ==
                                        view[element].getPropertyPriority(p.name))
                            if not sameprio and bool(p.priority) or (
                               sameprio and selector.specificity >=
                                            specificities[element][p.name]):
                                # later, more specific or higher prio
                                view[element].setProperty(p.name, p.value, p.priority)

    #pprint(view)
    return view                        

def render2style(document, view):
    """
    - add style into @style attribute
    - add style into @title attribute (for debugging)
    """
    for element, style in view.items():
        v = style.getCssText(separator=u'')
        element.set('style', v)
        element.set('title', v)

def render2content(document, view, css):
    """
    - add css as <style> element to be rendered by browser
    - replace elements content with actual style

    result is a HTML which the browser renders itself from the original css
    cssutils only writes debugging, useful to compare with render2style
    """
    e = etree.Element('style', {'type': 'text/css'})
    e.text = css
    document.find('head').append(e)
    for element, style in view.items():
        v = style.getCssText(separator=u'')
        element.text = v

def show(text, name, encoding='utf-8'):
    "saves text to file with name and encoding"
    f = codecs.open(name, 'w', encoding=encoding)
    f.write(text)
    f.close()
    webbrowser.open(name)

def main():
    tpl = '''<html><head><title>style test</title></head><body>%s</body></html>'''
    html = tpl % '''
            <h1>Style example 1</h1>
            <p>&lt;p></p>
            <p style="color: red;">&lt;p> with inline style: "color: red"</p>
            <p id="x" style="color: red;">p#x with inline style: "color: red"</p>
            <div>a &lt;div> green?</div>
            <div id="y">#y pink?</div>
        '''
    css = r'''
        * {
            margin: 0;
            }
        body {
            color: blue !important;
            font: normal 100% sans-serif;
        }
        p {
            color: green;
            font-size: 2em;
        }
        p#x {
            color: black !important;
        }
        div {
            color: green;
            font-size: 1.5em;
            }
        #y {
            color: #f0f;
            }
        .cssutils {
            font: 1em "Lucida Console", monospace;
            border: 1px outset;
            padding: 5px;
        }
    '''
    # TODO:
    #defaultsheet = cssutils.parseFile('sheets/default_html4.css')

    # adds style to @style
    document = getDocument(html, css)
    view = getView(document, css, styleCallback=styleattribute)
    render2style(document, view)
    text = etree.tostring(document, pretty_print=True)
    show(text, '__tempinline.html')

    # replaces elements content with style
    document = getDocument(html)
    view = getView(document, css, styleCallback=styleattribute)
    render2content(document, view, css)
    text = etree.tostring(document, pretty_print=True)
    show(text, '__tempbrowser.html')

if __name__ == '__main__':
    import sys
    sys.exit(main())

Step 5: Run lxml Example

Finally, run the example program.

python lxml_test.py
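
The trickiest part of the script is the test in getView that decides whether a new declaration replaces one already stored for an element. Here’s that decision pulled out into a standalone sketch. It assumes Python 3, that priorities are strings like "important" or "", and that specificities are 4-tuples like the (1,0,0,0) the script uses for inline styles; should_override is a name I made up.

```python
def should_override(new_priority, old_priority, new_specificity, old_specificity):
    """Decide whether a new declaration replaces the one already stored.

    should_override is a hypothetical helper mirroring the test in getView.
    Specificities are tuples, which Python compares item by item from the left.
    """
    same_priority = new_priority == old_priority
    if not same_priority:
        # Priorities differ: the new one wins only if it is the important one.
        return bool(new_priority)
    # Same priority: later rules win ties, hence >= rather than >.
    return new_specificity >= old_specificity


# "p { color: green }" against an inline "color: red":
print(should_override("", "", (0, 0, 0, 1), (1, 0, 0, 0)))            # prints False
# "p#x { color: black !important }" against the same inline style:
print(should_override("important", "", (0, 1, 0, 1), (1, 0, 0, 0)))   # prints True
```

If that reading is right, the plain p rule should lose to the inline red, while the !important rule on p#x should win.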

Conclusions

I’m hoping this guide met the goal of teaching you how to install lxml and run an example Python script to demonstrate some capabilities of lxml and cssutils. Feedback is welcomed and appreciated.