Posted by Derek@TheDailyLinux »
If you have an extra folder or file within a tarball (.tar) file, you can remove it without extracting the entire tarball first. This can be really handy when you have a massive file that you don’t want to spend a lot of time extracting and re-archiving. Open up a terminal shell and follow along with the example below.
We’ll first see what’s in the compressed tarball named sandbox.tar.gz by using the --list option of tar:
thelinuxdaily$ tar --list --file=sandbox.tar.gz
sandbox/
sandbox/delete_me/
sandbox/delete_me/hello.txt
sandbox/hello.txt
sandbox/hello2.txt
sandbox/hello3.txt
sandbox/save_me/
sandbox/save_me/hello.txt
Let’s try to delete the folder called sandbox/delete_me from the compressed tarball:
thelinuxdaily$ tar --delete --file=sandbox.tar.gz sandbox/delete_me
tar: Cannot update compressed archives
tar: Error is not recoverable: exiting now
See what happened? tar can’t modify a compressed archive, so we need to uncompress it first. In this case, it’s a .gz compression type, so we’ll use gunzip (if you were using bz2, you’d use bunzip2):
thelinuxdaily$ gunzip sandbox.tar.gz
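If you’re not sure which compression a tarball uses, the first few bytes of the file give it away. Here’s a small Python 3 sketch of that check. The helper name decompressor_for is made up for this example, and it only knows about gzip and bzip2, so treat it as a starting point rather than a complete detector.

```python
# Magic numbers at the start of the file: gzip is 1f 8b, bzip2 is "BZh".
MAGIC = [
    (b"\x1f\x8b", "gunzip"),
    (b"BZh", "bunzip2"),
]


def decompressor_for(path):
    """Return the tool to run before `tar --delete`, or None if the file
    does not look gzip- or bzip2-compressed (hypothetical helper)."""
    with open(path, "rb") as handle:
        head = handle.read(3)
    for magic, tool in MAGIC:
        if head.startswith(magic):
            return tool
    return None
```

Calling decompressor_for("sandbox.tar.gz") on the tarball from this example should return "gunzip". A return value of None suggests the archive is already uncompressed (or uses a format this sketch doesn’t cover).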
Let’s try to delete the folder sandbox/delete_me again:
thelinuxdaily$ tar --delete --file=sandbox.tar sandbox/delete_me
We didn’t get an error message, so that’s good. Let’s see if it’s gone by using --list again:
thelinuxdaily$ tar --list --file=sandbox.tar
sandbox/
sandbox/hello.txt
sandbox/hello2.txt
sandbox/hello3.txt
sandbox/save_me/
sandbox/save_me/hello.txt
Excellent! That’s what we needed to see. All that’s left is to compress the tarball again:
thelinuxdaily$ gzip sandbox.tar
If you have any other tips, feel free to use the comments below.
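For a compressed tarball, another option is to let a script do the work. The Python 3 sketch below is a different technique from tar --delete: it copies every member except the unwanted ones into a new compressed archive, without unpacking anything to disk. The function name copy_without and the output file name are assumptions made for this example.

```python
import tarfile


def copy_without(src_path, dst_path, unwanted):
    """Copy src_path to dst_path, skipping `unwanted` and everything under it.

    Returns the list of member names that were kept (hypothetical helper).
    """
    unwanted = unwanted.rstrip("/")
    kept = []
    # "r:*" lets tarfile detect gzip/bz2 compression on its own;
    # the copy is always written gzip-compressed ("w:gz").
    with tarfile.open(src_path, "r:*") as src, \
            tarfile.open(dst_path, "w:gz") as dst:
        for member in src:
            name = member.name.rstrip("/")
            if name == unwanted or name.startswith(unwanted + "/"):
                continue  # skip the folder itself and its contents
            # Only regular files carry data; directories and links do not.
            data = src.extractfile(member) if member.isreg() else None
            dst.addfile(member, data)
            kept.append(member.name)
    return kept
```

For the example in this post, that would be something like copy_without("sandbox.tar.gz", "sandbox-clean.tar.gz", "sandbox/delete_me"). The trade-off is that the whole archive gets rewritten, so for a plain .tar file tar --delete is still the more direct route.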
Posted by Derek@TheDailyLinux »
It all started with an email to a mailing list about an operating system that wasn’t going to be “big and professional like gnu”. So, happy birthday to Linux as of 25 Aug 91 20:57:08 GMT. Thanks for the good times so far!
Be sure to check out The Linux Foundation’s special 20th anniversary page:
http://www.linuxfoundation.org/20th/
Posted by Derek@TheDailyLinux »
Most of the guides I came across were just wildly all over the place, so I figured a short and simple guide that gets right to the point would be useful. Here’s how to set up Fedora 15 to build (compile might be another word) a .tex file into a .pdf file from the command line. It can be applied to other Linux distributions as well. There are GUIs to help with this, but it’s best to start with the basics. So, fire up a terminal and let’s get started.
Step 1: Install the TeX Live Package
The first thing we need to do is install the texlive package.
su -c 'yum install -y texlive'
Step 2: Prepare a .tex Document
Next, we’ll create a “hello world” type .tex file. Fire up your favorite text editor and copy/paste the following (I enjoy using vim since I can stay in the command line while building the LaTeX file). When you’re finished, save the file (I’ve saved it as “hello.tex”).
\documentclass{book}
\usepackage{lipsum}
\begin{document}
\chapter{Sample}
\lipsum[1-4]
\end{document}
Step 3: Build the .tex Document Into a PDF File
Finally, we’ll build (compile) the .tex source into a pretty PDF document using the command below:
pdflatex hello.tex
You’ll see something similar to this:
[dhildreth@drh hello_world]$ pdflatex hello.tex
This is pdfTeXk, Version 3.141592-1.40.3 (Web2C 7.5.6)
%&-line parsing enabled.
entering extended mode
(./hello.tex
LaTeX2e <2005/12/01>
Babel <v3.8h> and hyphenation patterns for english, usenglishmax, dumylang, noh
yphenation, arabic, basque, bulgarian, coptic, welsh, czech, slovak, german, ng
erman, danish, esperanto, spanish, catalan, galician, estonian, farsi, finnish,
french, greek, monogreek, ancientgreek, croatian, hungarian, interlingua, ibyc
us, indonesian, icelandic, italian, latin, mongolian, dutch, norsk, polish, por
tuguese, pinyin, romanian, russian, slovenian, uppersorbian, serbian, swedish,
turkish, ukenglish, ukrainian, loaded.
(/usr/share/texmf/tex/latex/base/book.cls
Document Class: book 2005/09/16 v1.4f Standard LaTeX document class
(/usr/share/texmf/tex/latex/base/bk10.clo))
(/usr/share/texmf/tex/latex/lipsum/lipsum.sty)
No file hello.aux.
Chapter 1.
[1{/usr/share/texmf/fonts/map/pdftex/updmap/pdftex.map}] [2] (./hello.aux) )</u
sr/share/texmf/fonts/type1/bluesky/cm/cmbx12.pfb></usr/share/texmf/fonts/type1/
bluesky/cm/cmr10.pfb></usr/share/texmf/fonts/type1/bluesky/cm/cmsl10.pfb>
Output written on hello.pdf (2 pages, 22507 bytes).
Transcript written on hello.log.
[dhildreth@drh hello_world]$
Notice the line “Output written on hello.pdf (2 pages, 22507 bytes).” That means it worked.
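If you ever want to script the build, that success line is easy to check for. Here’s a rough Python 3 sketch that runs pdflatex and pulls the file name, page count and size out of its output. The function names are made up for this example, and it assumes the line isn’t wrapped (pdflatex may wrap long lines, which this sketch does not handle).

```python
import re
import subprocess

# Matches the success line pdflatex prints, e.g.
# "Output written on hello.pdf (2 pages, 22507 bytes)."
OUTPUT_LINE = re.compile(
    r"Output written on (?P<pdf>\S+) "
    r"\((?P<pages>\d+) pages?, (?P<size>\d+) bytes\)\."
)


def parse_pdflatex_output(log_text):
    """Return (pdf_name, pages, size_in_bytes), or None if no PDF was written."""
    match = OUTPUT_LINE.search(log_text)
    if match is None:
        return None
    return match.group("pdf"), int(match.group("pages")), int(match.group("size"))


def build_pdf(tex_file):
    """Run pdflatex on tex_file and report what it wrote (hypothetical helper)."""
    result = subprocess.run(
        # nonstopmode keeps pdflatex from waiting for input on errors
        ["pdflatex", "-interaction=nonstopmode", tex_file],
        capture_output=True,
        text=True,
    )
    return parse_pdflatex_output(result.stdout)
```

With the hello.tex from Step 2, build_pdf("hello.tex") should come back with the PDF name, the page count and the byte size, or None if the build failed.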
Step 4: View the New PDF File
Now, all you need to do is view the generated PDF document. Run the following command:
evince hello.pdf
Note: if you’re like me, you constantly flip between the source and the PDF. evince automatically reloads the PDF every time you rebuild it with pdflatex, so add “&” to the end of the command above to run the viewer in the background and keep your terminal free:
evince hello.pdf &
When everything is done, evince should show the new two-page PDF: a “Chapter 1” heading titled “Sample”, followed by the lorem ipsum filler text.
Alternative: Rubber
I’d like to make a special note about the ‘rubber’ utility. It can take a lot of the guesswork out of building LaTeX files and I strongly recommend using it over pdftex or pdflatex as mentioned above. You can install and run the ‘rubber’ utility with the following commands:
su -c 'yum install -y rubber'
rubber --pdf hello.tex
Conclusions
I hope this helps you get started moving quickly in the right direction. Comments and feedback are welcome.
A few difficulties I had when doing this for the first time were:
- “Where’s the ‘latex’ package??”
- Turns out, texlive is the package to install these days.
- Undefined control sequence message:
[dhildreth@drh hello_world]$ pdftex small2e.tex
This is pdfTeXk, Version 3.141592-1.40.3 (Web2C 7.5.6)
%&-line parsing enabled.
entering extended mode
(./small2e.tex
! Undefined control sequence.
l.11 \documentclass
{article} % Your input file must contain these two...
?
- Apparently, there’s a difference between pdflatex and pdftex: pdftex processes plain TeX and doesn’t know LaTeX commands such as \documentclass, while pdflatex loads the LaTeX format first. Keep this in mind when moving forward. I would recommend using the command rubber --pdf hello.tex instead, since it takes care of the decision for you and offers many other benefits (see man rubber for more info).
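To make the pdftex/pdflatex difference concrete, here’s a tiny Python 3 sketch that guesses which command a source file needs by looking for an uncommented \documentclass (or the older \documentstyle). It’s only a heuristic, and the name pick_engine is made up for this example.

```python
import re

# A LaTeX source declares its class with \documentclass (or the older
# \documentstyle); plain TeX sources do not. Lines where a "%" comes
# first are treated as commented out.
LATEX_MARKER = re.compile(r"^[^%\n]*\\document(class|style)\b", re.MULTILINE)


def pick_engine(source):
    """Guess which command to run for a .tex source (hypothetical helper)."""
    return "pdflatex" if LATEX_MARKER.search(source) else "pdftex"
```

For the hello.tex above this picks pdflatex. For a plain TeX file that just ends in \bye, it picks pdftex.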
Posted by Derek@TheDailyLinux »
The goal of this guide is to install the lxml Python package and run the example cssutils Python script that uses the lxml package. This guide was written while running Ubuntu, but can be applied to any Linux distribution. Open up the terminal and let’s get started.
Step 1: Install easy_install For Python
The easy_install binary is a Python package manager that makes it easy to install lxml and cssutils. The following steps will get this installed:
sudo apt-get install curl
curl -O http://python-distribute.org/distribute_setup.py
sudo python distribute_setup.py
Step 2: Install Required Linux Packages for lxml
Now, we install the Linux packages required for building the lxml Python package.
sudo apt-get install libxml2-dev libxslt-dev python-dev
Step 3: Install the lxml and cssutils Python Packages
Next, install the lxml and cssutils Python packages using easy_install.
sudo easy_install lxml
sudo easy_install cssutils
Step 4: Prepare lxml Example
The following is a script copied from http://cssutils.googlecode.com/svn/trunk/examples/style.py. It is written in Python 2 syntax. Copy and paste it into your favorite text editor and save it. I named it lxml_test.py.
# -*- coding: utf-8 -*-
"""
example renderer

moves infos from external stylesheet "css" to internal @style attributes
and for debugging also in @title attributes.

adds css as text to html
"""
from pprint import pprint
import codecs
import cssutils
import os
import sys
import webbrowser

# lxml egg may be in a lib dir below this file (not in SVN though)
sys.path.append(os.path.join(os.path.dirname(__file__), 'lib'))
try:
    import pkg_resources
    pkg_resources.require('lxml')
except pkg_resources.DistributionNotFound, e:
    pass

try:
    from lxml import etree
    from lxml.builder import E
    from lxml.cssselect import CSSSelector
except ImportError, e:
    print 'You need lxml for this example:', e
    sys.exit(1)

def log(level, *msg):
    # logging is switched off: the print statement is wrapped in a docstring
    """print '%s- %s' % (level * '\t ',
                     ' '.join((str(m) for m in msg)))"""


def getDocument(html, css=None):
    """
    returns a DOM of html, if css is given it is appended to html/body as
    pre.cssutils
    """
    document = etree.HTML(html)
    if css:
        # prepare document (add css for debugging)
        e = etree.Element('pre', {'class': 'cssutils'})
        e.text = css
        document.find('body').append(e)
    return document


def styleattribute(element):
    "returns css.CSSStyleDeclaration of inline styles, for html: @style"
    cssText = element.get('style')
    if cssText:
        return cssutils.css.CSSStyleDeclaration(cssText=cssText)
    else:
        return None

def getView(document, css, media='all', name=None,
            styleCallback=lambda element: None):
    """
    document
        a DOM document, currently an lxml HTML document
    css
        a CSS StyleSheet string
    media: optional
        TODO: view for which media it should be
    name: optional
        TODO: names of sheets only
    styleCallback: optional
        should return css.CSSStyleDeclaration of inline styles, for html
        a style declaration for ``element@style``. Gets one parameter
        ``element`` which is the relevant DOMElement

    returns style view
        a dict of {DOMElement: css.CSSStyleDeclaration} for html
    """
    sheet = cssutils.parseString(css)

    view = {}
    specificities = {} # needed temporarily

    # TODO: filter rules simpler?, add @media
    rules = (rule for rule in sheet if rule.type == rule.STYLE_RULE)
    for rule in rules:
        for selector in rule.selectorList:
            log(0, 'SELECTOR', selector.selectorText)
            # TODO: make this a callback to be able to use other stuff than lxml
            cssselector = CSSSelector(selector.selectorText)
            matching = cssselector.evaluate(document)
            for element in matching:
                #if element.tag in ('div',):
                # add styles for all matching DOM elements
                log(1, 'ELEMENT', id(element), element.text)

                if element not in view:
                    # add initial empty style declaration
                    view[element] = cssutils.css.CSSStyleDeclaration()
                    specificities[element] = {}

                    # and add inline @style if present
                    inlinestyle = styleCallback(element)
                    if inlinestyle:
                        for p in inlinestyle:
                            # set inline style specificity
                            view[element].setProperty(p)
                            specificities[element][p.name] = (1,0,0,0)

                for p in rule.style:
                    # update style declaration
                    if p not in view[element]:
                        # setProperty needs a new Property object and
                        # MUST NOT reuse the existing Property
                        # which would be the same for all elements!
                        # see Issue #23
                        view[element].setProperty(p.name, p.value, p.priority)
                        specificities[element][p.name] = selector.specificity
                        log(2, view[element].getProperty('color'))
                    else:
                        log(2, view[element].getProperty('color'))
                        sameprio = (p.priority ==
                                    view[element].getPropertyPriority(p.name))
                        if not sameprio and bool(p.priority) or (
                           sameprio and selector.specificity >=
                                        specificities[element][p.name]):
                            # later, more specific or higher prio
                            view[element].setProperty(p.name, p.value, p.priority)

    #pprint(view)
    return view

def render2style(document, view):
    """
    - add style into @style attribute
    - add style into @title attribute (for debugging)
    """
    for element, style in view.items():
        v = style.getCssText(separator=u'')
        element.set('style', v)
        element.set('title', v)


def render2content(document, view, css):
    """
    - add css as <style> element to be rendered by browser
    - replace elements content with actual style

    result is a HTML which the browser renders itself from the original css
    cssutils only writes debugging, useful to compare with render2style
    """
    e = etree.Element('style', {'type': 'text/css'})
    e.text = css
    document.find('head').append(e)
    for element, style in view.items():
        v = style.getCssText(separator=u'')
        element.text = v


def show(text, name, encoding='utf-8'):
    "saves text to file with name and encoding"
    f = codecs.open(name, 'w', encoding=encoding)
    f.write(text)
    f.close()
    webbrowser.open(name)

def main():
    tpl = '''<html><head><title>style test</title></head><body>%s</body></html>'''
    # angle brackets shown as text are escaped so they do not open new tags
    html = tpl % '''
        <h1>Style example 1</h1>
        <p>&lt;p&gt;</p>
        <p style="color: red;">&lt;p&gt; with inline style: "color: red"</p>
        <p id="x" style="color: red;">p#x with inline style: "color: red"</p>
        <div>a &lt;div&gt; green?</div>
        <div id="y">#y pink?</div>
    '''
    css = r'''
        * {
            margin: 0;
        }
        body {
            color: blue !important;
            font: normal 100% sans-serif;
        }
        p {
            color: green;
            font-size: 2em;
        }
        p#x {
            color: black !important;
        }
        div {
            color: green;
            font-size: 1.5em;
        }
        #y {
            color: #f0f;
        }
        .cssutils {
            font: 1em "Lucida Console", monospace;
            border: 1px outset;
            padding: 5px;
        }
    '''
    # TODO:
    #defaultsheet = cssutils.parseFile('sheets/default_html4.css')

    # adds style to @style
    document = getDocument(html, css)
    view = getView(document, css, styleCallback=styleattribute)
    render2style(document, view)
    text = etree.tostring(document, pretty_print=True)
    show(text, '__tempinline.html')

    # replaces elements content with style
    document = getDocument(html)
    view = getView(document, css, styleCallback=styleattribute)
    render2content(document, view, css)
    text = etree.tostring(document, pretty_print=True)
    show(text, '__tempbrowser.html')


if __name__ == '__main__':
    import sys
    sys.exit(main())
Step 5: Run lxml Example
Finally, run the example program.
python lxml_test.py
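The heart of getView in the script above is the test that decides whether a new declaration may replace one that’s already stored for an element. Here it is pulled out into a standalone Python 3 sketch that needs neither lxml nor cssutils. The function name overrides is made up for this example, and it assumes priorities are either an empty string or "important" and specificities are 4-tuples, as they are in the script.

```python
def overrides(new_priority, new_specificity, old_priority, old_specificity):
    """Return True if the new declaration should replace the stored one.

    Mirrors the condition in getView (hypothetical standalone helper):
    - different priority: the new one wins only if it is "important"
    - same priority: the new one wins if it is at least as specific
    """
    same_priority = new_priority == old_priority
    if not same_priority:
        return bool(new_priority)
    return new_specificity >= old_specificity


INLINE = (1, 0, 0, 0)      # specificity the script assigns to inline @style
P_RULE = (0, 0, 0, 1)      # selector "p"
P_X_RULE = (0, 1, 0, 1)    # selector "p#x"

# "p { color: green }" cannot beat the inline "color: red" ...
print(overrides("", P_RULE, "", INLINE))
# ... but "p#x { color: black !important }" can.
print(overrides("important", P_X_RULE, "", INLINE))
```

That’s why, in the generated page, the paragraph with only an inline style stays red while p#x ends up black.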
Conclusions
I’m hoping this guide met the goal of teaching you how to install lxml and run an example Python script to demonstrate some capabilities of lxml and cssutils. Feedback is welcomed and appreciated.