Sunday, October 30, 2011

calibre resources

calibre is a free and open source community driven project. The software has many layers. The most commonly used functions like format conversion, metadata editing, news fetching etc have big clear icons and are very easy to use. Yet calibre is very powerful and offers a lot of flexibility by way of customization options through "Preferences". Some customization options were discussed in earlier blog posts and a lot more will be discussed in the future. For the more advanced users who know how to use regular expressions, create custom style sheets, and/or have some minimum knowledge of python, calibre offers even more flexibility. So calibre, although accessible to anyone for basic functionality, is a very versatile and powerful program.

For those who want to learn to exploit the more advanced or subtle features of calibre, a lot of help is available. I hope this article makes you aware of various resources associated with calibre so you can make the most of it.

Mobileread: calibre was originally developed with the support of the mobileread community. There is a dedicated calibre forum on this site.  Kovid himself as well as a lot of long time calibre users and developers are members of this forum. The members of this forum are very friendly and responsive to help requests and questions. Many of them have a lot of experience with using the various subtle features of calibre and some are technically proficient as well. The best way to contact Kovid is to post your questions here. Also if your questions are detailed or technical in nature, this is the best place to get help.

The calibre mobileread forum was divided into a number of sub forums a few months ago. The sub forums are (see figure below):

  • Recipes: Here you can post requests for news sources. If you have made a news recipe for some source that is not already included in calibre, we would appreciate it if you shared it here. Small contributions from a lot of people are what make calibre such versatile software. You should put your name in the author name section of the recipe for credit, and the recipe has to be licensed under the GPLv3. You can also post improvements to existing recipes here. If some existing recipe has stopped working, make a post about it and the author of the recipe, or somebody who has time, will look into it and fix it as soon as possible.
  • Devices: This sub forum deals with device specific issues. You can ask for support for new devices here. So if your question or problem concerns a particular device like the Kindle or iPhone, this is the place to post it. Those of you who use Apple devices may want to check out this sticky.
  • Catalogs: If you have questions about creating a catalog or list of ebooks in your calibre library, this is the place to post. calibre allows you to format your catalog and make it suitable to be printed as well as read on e-readers. There are a number of settings that you can find out about here.
  • Conversion: calibre's conversion engine is sophisticated and has many features you may not have encountered yet. This is the place to get help on ways to customize the conversion process.
  • Library Management: calibre allows you to play around with the ebooks metadata and present it in various ways. Any questions about downloading metadata, editing it, or exporting it can be asked here. You can also ask about the tag browser and managing and viewing your ebooks.
  • Plugins: A lot of calibre's features are implemented by way of plugins. These are for less commonly used features that may be useful to specific subsections of people. This system makes calibre versatile, yet simple to use. This sub forum is for requesting new plugins or for getting help with existing plugins.
  • Development: This is for those of you who want to work on calibre and add to its capabilities. You can get help from Kovid and other developers to smooth the way for you. calibre relies on input from various developers who have made small and large code contributions over the years. Some of them have found working on calibre to be a fulfilling experience. One thing to remember is that, although you get full credit for your code patch, all code submitted to calibre must be licensed under the GPLv3 and cannot be proprietary.

The screen names of the calibre mobileread moderators are Starson17, kovidgoyal, Piper_, GRiker, theducks, Manichean, kiwidude, ldolse, dwanthny, chaley, user_none. Many of them have made code and recipe contributions to calibre in addition to the help they provide by way of answering questions.

Mobileread also has a number of other forums for various book devices and ebooks and authors that you may find useful.

Facebook: For those of you with Facebook accounts, calibre has a Facebook fan page which we use as a help forum. You can post any questions, concerns or feedback you have about calibre. They will be addressed there or you will be directed to a more appropriate place for help. The Facebook fan page is managed by Krittika Goyal (me). I answer most of the questions, so please be patient with me because sometimes I may be busy and won't get to a question for a few days. Also, I may not know the answer to some questions and it takes me a while to find out.

Twitter: For those of you who tweet, calibre has a twitter page called "calibreforum", run by Kovid's father Niraj Goyal. However if you have a detailed question opt for one of the other help forums because the 140 character limit on twitter makes answering difficult.

Help forum etiquette: The volunteers in the various calibre help forums are very friendly. They are happy to help with questions of all kinds from the trivial to the very technical. However please be polite in your questions and patient with those helping you. This is the quickest way to get your questions answered. Please do not vent your frustrations on those who are volunteering their time to help you.

Self help: calibre has been well documented, both for the benefit of its developers and for its users. There is an extensive FAQ as well as a detailed user manual. When you encounter a problem, first check the FAQ. If your question has been addressed in the FAQ, you have a quick solution and you will save some time for the volunteers at the help forums. There are also a number of helpful demos and video tutorials to be found here. With time we hope this blog will also contain useful articles that you can use as a reference.

Open Books: calibre is open source software and we firmly believe in the open source philosophy. Digital Rights Management (DRM), in addition to being a source of inconvenience to users, is one of the roadblocks to exploiting all the features of calibre. While we don't believe in breaking the law, we do believe it needs to be changed. In our efforts to eradicate DRM, we have started "Open Books", a catalog of DRM free ebooks. The idea is to give publicity to DRM free ebooks and their authors as well as to provide calibre users with a large list of DRM free ebooks to choose from, so they can truly use calibre to its full potential without breaking any laws.

Open Books is a site for easy browsing of DRM-free e-books that are not in the public domain. Most public domain ebooks are available DRM free at the Project Gutenberg site. Open Books is a compilation of DRM free e-books from various sources, linked to enable readers to browse and download them.
Open Books now lists over 2730 books from over 30 stores and features over 1020 authors.

I hope you will make use of these resources to improve your calibre experience and some day become an active member of the calibre community contributing help at one of these forums. This article has been a digression from our usual format of tips and tricks but I hope you find it useful. Next week we will be back with a more usual article. Have a good week!

Friday, October 21, 2011

calibre's tag browser

The calibre tag browser is a versatile and powerful tool with some obvious and many obscure but extremely useful features. Today I will discuss some of these features. The figure on the left below shows the part of the main calibre window which is the "tag browser". One of the more obscure features I will discuss is how to make subgroups in the tag browser as shown in the portion highlighted in blue on the right figure below.

Show ebooks with tag: To explore a particular category in the tag browser, for example "Tags", click the little arrow next to it and a list of all the tags will appear. Suppose you are interested in ebooks tagged "Romance": click on that tag and a green "+" sign appears next to it, indicating that the ebooks now displayed in the main calibre window are tagged "Romance". The tag browser allows you to choose multiple tags simultaneously. For example, once the "Romance" tag has been clicked, if you hold down the "Ctrl" key on your keyboard and select another tag, say "Family Life", then the green "+" sign appears next to both tags as shown in the figure below. Now look at the second entry in the bottom right corner of the window. If it says "Match any", then the main calibre window will display all ebooks that have the tag "Romance" as well as all ebooks that have the tag "Family Life", as shown in the figure below. However, if you click on "Match any", a menu shows up with another option, namely "Match all". If you select "Match all", then the main calibre window will display only those ebooks that have both tags; unlike in the figure below, only "Pride and Prejudice by Jane Austen", which has both tags "Romance" and "Family Life", will be displayed. You could also have chosen the two tags from different categories, for example "Humour" (from Tags) and "George Bernard Shaw" (from Authors). In that case "Match all" gives all ebooks tagged "Humour" by George Bernard Shaw, while "Match any" gives all ebooks tagged "Humour" as well as all ebooks by George Bernard Shaw.

Show all ebooks w/o tag: Now instead of displaying all ebooks with a particular tag, you may want to display all ebooks that do not have a particular tag. For example, if you want to see all your ebooks except the ones written by George Bernard Shaw, click the little arrow next to the "Authors" category to get the list of authors and then click on George Bernard Shaw. A little green "+" sign appears next to it. Now click again on George Bernard Shaw. This time a little red "-" sign appears next to it indicating that the ebooks displayed in the main calibre library include all your ebooks except those by George Bernard Shaw. By holding down the "Ctrl" key you can select multiple entries. After having eliminated all ebooks by George Bernard Shaw if you hold down the "Ctrl" key and left click twice on "calibre" a red "-" sign appears next to "calibre" as well as shown in the figure below and in the list of ebooks displayed in your main calibre window you will see your entire ebook collection except those authored by either George Bernard Shaw or calibre.
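The matching rules described in the last two sections behave like set operations. Here is a minimal sketch in Python; the little "library" dictionary, its titles and tags, and the function names are all made up for illustration and have nothing to do with how calibre stores or searches its data internally.

```python
# Hypothetical mini-library: title -> (set of tags, author).
# This is NOT calibre's data model, just an illustration.
library = {
    "Pride and Prejudice": ({"Romance", "Family Life"}, "Jane Austen"),
    "Emma": ({"Romance"}, "Jane Austen"),
    "Little Women": ({"Family Life"}, "Louisa May Alcott"),
    "Pygmalion": ({"Humour"}, "George Bernard Shaw"),
}

def match_any(wanted):
    """Titles carrying at least one of the wanted tags ("Match any")."""
    return sorted(t for t, (tags, _) in library.items() if tags & wanted)

def match_all(wanted):
    """Titles carrying every one of the wanted tags ("Match all")."""
    return sorted(t for t, (tags, _) in library.items() if wanted <= tags)

def exclude_authors(unwanted):
    """Titles whose author is not in the unwanted set (the red "-" state)."""
    return sorted(t for t, (_, author) in library.items()
                  if author not in unwanted)

print(match_any({"Romance", "Family Life"}))
print(match_all({"Romance", "Family Life"}))
print(exclude_authors({"George Bernard Shaw"}))
```

"Match any" is a union of the selected tags, "Match all" is an intersection, and the red "-" sign simply filters the selected entry out.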

Sorting tags: calibre allows you to sort the tags in the tag browser in many different ways. For example, the list of authors in the above figure is displayed in alphabetical order of their last names. This is because the first entry in the bottom left corner is set to "Sort by name". However, if you click on it, you get a menu with a number of options. If you select the option "Sort by popularity" instead, then the authors will be displayed in descending order of the number of ebooks by them in your library. You can also choose "Sort by rating", in which case the authors will appear in decreasing order of their average rating (by stars assigned to their ebooks in the ratings column).
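To make the three sort orders concrete, here is a small Python sketch. The list of (author, rating) pairs is invented for the example, and the code only mimics the ordering described above; it is not calibre's own implementation.

```python
from collections import defaultdict

# Hypothetical data: one (author sort name, star rating) pair per ebook.
books = [
    ("Austen, Jane", 5),
    ("Austen, Jane", 4),
    ("Shaw, George Bernard", 3),
    ("Dickens, Charles", 5),
    ("Austen, Jane", 3),
    ("Dickens, Charles", 4),
]

# Collect the ratings of each author's ebooks.
ratings = defaultdict(list)
for author, stars in books:
    ratings[author].append(stars)

def average(author):
    """Average star rating over all of an author's ebooks."""
    return sum(ratings[author]) / len(ratings[author])

# "Sort by name": alphabetical by last name.
by_name = sorted(ratings)
# "Sort by popularity": most ebooks first, ties broken by name.
by_popularity = sorted(ratings, key=lambda a: (-len(ratings[a]), a))
# "Sort by rating": highest average rating first, ties broken by name.
by_rating = sorted(ratings, key=lambda a: (-average(a), a))

print(by_name)
print(by_popularity)
print(by_rating)
```

With this sample data, Jane Austen comes first by popularity (three ebooks), while Charles Dickens comes first by rating (average of 4.5 stars).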

Making subgroups of tags: The top figure on the right shows a subgroup of tags under the main tag "Classics". To create such subgroups you need to enable this feature. To do this, go to Preferences -> Look and Feel and click on the "Tag Browser" tab. Here, in the field for "Categories with hierarchical items", enter "tags" and click "Apply" in the top left hand corner. Now, to actually get hierarchical tags, enter a tag of the form MainTag.SubTag in the tags section of the metadata entry. As in the top figure on the right, an example would be Classics.Russian or Classics.English.
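The MainTag.SubTag convention is easy to picture as a tree. The following Python sketch shows one way dotted tags could be nested; the function names are hypothetical and this is only an illustration of the idea, not the code calibre uses.

```python
def build_tag_tree(tags):
    """Nest dotted tags such as "Classics.Russian" into a dict of dicts."""
    tree = {}
    for tag in tags:
        node = tree
        for part in tag.split("."):
            # Create the level if it does not exist yet, then descend into it.
            node = node.setdefault(part, {})
    return tree

def render(tree, depth=0):
    """Return the tree as indented lines, with children sorted by name."""
    lines = []
    for name in sorted(tree):
        lines.append("  " * depth + name)
        lines.extend(render(tree[name], depth + 1))
    return lines

tree = build_tag_tree(["Classics.Russian", "Classics.English", "Romance"])
print("\n".join(render(tree)))
```

Here "Russian" and "English" end up as children of a single "Classics" entry, while "Romance", which has no period in it, stays at the top level.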

Managing categories and tags: If you right click on a particular category in the tag browser, a menu appears. The first option allows you to hide the category, so you can reduce clutter if you are not interested in it. The fourth option allows you to manage and create user categories. Say two people, John and Joe, use the same calibre library, but Joe would like to maintain a separate list of his tags so he only has to search through his own. To do this, right click on a category in the tag browser and choose the "Manage User Categories" option. A new window opens, as shown in the left figure below. Enter the title, say "Joe's tags", in the right hand top corner and click the green "+" sign. Now you have a new user category called "Joe's tags" that shows up in the left hand top corner, as in the figure below. You can now select the tags on the left and use the blue arrow to copy those tags into the "Joe's tags" category. When you have finished copying the tags, click "OK" and the new category with its tags appears in the tag browser, as shown in the right figure below. Now if you right click on this user category, you will see an option that allows you to create sub categories. In addition, if you right click on any tag in the tag browser, you will see the option to manage tags, which allows you to rename and delete tags.

Hope you found this post useful. See you in about a week.

Friday, October 14, 2011

ebook format conversion

A lot of you have used calibre to convert your ebooks between different formats. The calibre conversion system is very sophisticated and has many subtleties. Today I will discuss just a few features that are often overlooked and yet a lot of you may find very useful. The following is the "Convert books" icon in the top main calibre tool bar:

Bulk convert:
calibre allows you to convert books in bulk. You can select a set of books, which need not all be in the same format, and convert them all at once to a different format. Say you have a Kindle and a number of books in EPUB, RTF and PDF formats. You can select them all, then click the "Convert books" icon, or click the little arrow next to the "Convert books" icon and choose the option "Bulk convert". A new window opens. At the top right corner of the window you can set the output format, which in this case would be MOBI.
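If you prefer scripting, calibre also ships a command line tool called ebook-convert that takes an input file and an output file. The Python sketch below only builds one such command per file and prints it; the file names are made up, the helper function is hypothetical, and unlike the "Bulk convert" button it works on loose files rather than on books in your calibre library.

```python
from pathlib import Path

def bulk_convert_commands(paths, output_format="mobi"):
    """Build one ebook-convert command per input file (nothing is executed)."""
    commands = []
    for path in map(Path, paths):
        # Same file name, new extension, e.g. emma.epub -> emma.mobi
        target = path.with_suffix("." + output_format)
        commands.append(["ebook-convert", str(path), str(target)])
    return commands

# Hypothetical file names, used only for illustration.
commands = bulk_convert_commands(["emma.epub", "notes.rtf", "manual.pdf"])
for cmd in commands:
    print(" ".join(cmd))
    # To actually run a conversion you could pass cmd to
    # subprocess.run(cmd, check=True), assuming ebook-convert is on your PATH.
```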

calibre allows you to choose various settings for the conversion output. A few of these are discussed below. When you select a book and click the "Convert books" icon  in the top tool bar a new window opens. On the left of the window is a menu with a number of entries.

Smart Punctuation: The second entry in the menu is "Look and Feel". When you select it, on the right you will see a number of options to customize the look and feel of the output of the conversion. Under the section called "Text justification" there is an entry called "Smarten punctuation". When this is selected before the conversion, all plain quotes, dashes and ellipses are converted to their typographically correct equivalents, for example plain quotes to curly quotes. You can do the reverse by selecting "Unsmarten punctuation".

Page Setup: The fourth entry in the menu is "Page Setup". This option allows you to optimize the conversion output for a particular device, specifically the image sizes so they fit well on the screen of the device. So say you have a Kindle, then you would choose "Kindle" in the list of "Output Profiles" that appear when "Page Setup" is selected. Say you have converted all your ebooks for your Kindle and stored them that way, and now you have bought an iPad. To optimize the output for ebooks converted to EPUB for the iPad from MOBI for the Kindle, select "Kindle" in the list of "Input Profiles" and "iPad" in the list of "Output Profiles".

Covers and Metadata: The fifth entry in the menu is "Structure Detection". When you select it on the right side you will see the option called "Insert metadata as page at start of book". Selecting it will include the metadata at the beginning of your ebook itself, including reviews if they are available. Suppose you do not want to convert the ebook to a different format but want to include the metadata and cover within the ebook, then just select the same formats for the input and output during conversion with the "Insert metadata as page at start of book" option checked. See figure below for an example. The metadata along with a choice of covers for an ebook can be obtained by selecting the ebook and clicking the "Edit metadata" button in the top main calibre tool bar.

Some ebooks have an image as a cover which is not explicitly marked as the cover. Instead it is just another image in the ebook. In this case when you convert the book in calibre, the output may have two covers. If this happens, re-do the conversion with the "Remove first image" option (above the "Insert metadata as page at start of book") selected. This will give you an ebook with just one cover.

MOBI Table of Contents: For the specific case of converting to MOBI, the eighth option on the left menu is "MOBI output". For MOBI output calibre generates a Table of contents (TOC), which is inserted by default at the end of the ebook. When the "MOBI output" option is selected in the left menu you can choose "Do not add Table of Contents to book" if you do not want the TOC or choose "Put generated Table of Contents at start of book instead of end" if you want the TOC at the beginning of the book. The TOC calibre generates is hyperlinked to the chapters.
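The settings discussed in this post have counterparts among the options of the ebook-convert command line tool. The Python sketch below maps them onto an argument list without running anything. The function and its parameter names are hypothetical, and although the flags shown are the ones I know from ebook-convert, the exact set can differ between calibre versions, so check "ebook-convert --help" before relying on them.

```python
def convert_command(source, target, *, smarten=False, input_profile=None,
                    output_profile=None, insert_metadata=False,
                    remove_first_image=False, mobi_toc="end"):
    """Build an ebook-convert argument list mirroring the dialog settings."""
    cmd = ["ebook-convert", source, target]
    if smarten:                      # Look and Feel: "Smarten punctuation"
        cmd.append("--smarten-punctuation")
    if input_profile:                # Page Setup: input profile
        cmd += ["--input-profile", input_profile]
    if output_profile:               # Page Setup: output profile
        cmd += ["--output-profile", output_profile]
    if insert_metadata:              # "Insert metadata as page at start of book"
        cmd.append("--insert-metadata")
    if remove_first_image:           # "Remove first image"
        cmd.append("--remove-first-image")
    if mobi_toc == "start":          # MOBI output: TOC at start instead of end
        cmd.append("--mobi-toc-at-start")
    elif mobi_toc == "none":         # MOBI output: no generated TOC
        cmd.append("--no-inline-toc")
    return cmd

# Kindle MOBI to iPad EPUB, as in the Page Setup example (file names made up).
print(convert_command("book.mobi", "book.epub",
                      input_profile="kindle", output_profile="ipad"))
```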

Saturday, October 8, 2011

Custom news fetching

The "Fetch News" feature in calibre is very powerful and versatile. It has over 1000 built-in news recipes, spanning over 30 languages and 50 countries, for websites of newspapers, magazines and blogs. At the click of a few buttons you can set up regular downloads of the news source of your choice. Not only can you choose from this large variety of available recipes, but you can also create a customized recipe of your own or tweak one of the available recipes to make it better suited to you. In the past, creating pretty news recipes has been a job for those with some expertise in python. However, thanks to the "auto clean up" feature recently included in calibre, for many websites or combinations of websites anyone can follow a simple formula to write a recipe. This may not work for a small fraction of complicated websites but is well worth a try. The "Fetch News" icon in the top calibre toolbar is shown in the figure below.

Basic news fetching: To get started, just click on the above icon in the top toolbar in the main calibre window. A new window, shown in Figure: 1 below, opens up. It lists all available news sources by language and country. For example, "Arabic[2]" on the second line indicates there are 2 news sources available in the Arabic language. To see what these news sources are, click the little arrow on the left of "Arabic". You will notice that in the case of English there are a number of entries. This is because a number of different countries publish English newspapers, magazines and blogs. The list indicates there are 7 news sources from Australia, 1 from Bulgaria, 23 from Canada and so on.

If you are looking for a particular news source, say the "Washington Post", then you can search for it as shown in Figure: 2 above. Type all or some of the letters in the search bar and press enter. The list below will shrink to include only the matches to your search, so, as shown in Figure: 2 above, only 3 news sources match the word "washington". Now if you select "The Washington Post", options appear on the right side that allow you to download the news immediately (bottom right, "Download Now" button), or to set up a schedule for automatic downloads. The automatic download will occur the first time calibre is run on your computer after the scheduled download time.

Adding RSS feeds to existing news recipes: There are some very simple things you can do to customize the news recipes to your taste.
News recipes contain different RSS feeds for different sections like Sports, Politics, Business etc. You may be interested in a feed that is not included like say Entertainment.
It is very simple to add this feed of interest to you.

First go to the news website of interest. Search the page for RSS (RSS feeds), usually indicated by a little rectangular orange icon. Now click on this link and it should take you to a page with RSS links to various sections of the news like Sports, Politics, Business etc.

Start up calibre, click the little arrow next to the "Fetch News" icon and click on "Add a custom news source". A new window opens up; in its bottom left corner click on "Customize builtin recipe". Now a little window opens up where you can pick the recipe of the news source you wish to customize.

Example : The Los Angeles Times

Now on the left column of the previous window The Los Angeles Times should show up. Select it and the recipe will show up on the right column.

If you look at the recipe you will find a block with RSS feeds that begins with

    "feeds = [ ...

Suppose instead it had looked like this (with fewer feeds, and not including the Sports feed, which, say, you are interested in):

    feeds = [
              (u'Top News'             , u'')
             ,(u'Local News'           , u'')
             ,(u'National'             , u'')
             ,(u'National Politics'    , u'')
             ,(u'Business'             , u'')
             ,(u'Education'            , u'')
             ,(u'Environment'          , u'')
             ,(u'Religion'             , u'')
             ,(u'Science'              , u'')
             ,(u'Technology'           , u'')
             ,(u'Africa'               , u'')
            ]

Then all you have to do is follow the same syntax: add a new entry with the name "Sports" and the link to the Sports feed, taken from the RSS feeds page of The Los Angeles Times website.

Make sure the link you get looks similar to the other links in the recipe. If not try to copy the link from the little square orange icon.

If there are RSS feeds corresponding to sections that do not interest you, you can delete the names and links corresponding to those sections. This will make the download process faster and remove clutter.
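In Python terms, the feeds block is just a list of (name, link) pairs, so adding or removing a section means adding or removing one pair. Here is a small sketch; the example.com links are placeholders I made up, not real feed addresses. In the recipe window you would simply type or delete the lines by hand rather than run code like this.

```python
# Placeholder feeds: the example.com links are stand-ins, not real addresses.
feeds = [
    (u'Top News', u'https://example.com/rss/top.xml'),
    (u'Business', u'https://example.com/rss/business.xml'),
    (u'Religion', u'https://example.com/rss/religion.xml'),
]

# Add a section: one more (name, link) pair in the same form as the others.
feeds.append((u'Sports', u'https://example.com/rss/sports.xml'))

# Drop the sections you do not read.
unwanted = {u'Religion'}
feeds = [(name, link) for name, link in feeds if name not in unwanted]

print([name for name, _ in feeds])
```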

Then click the Add/Update recipe button at the bottom left corner. Now a new "replace recipe?" window opens up. Click replace recipe and you are done!

To access this recipe go to the main calibre window and click "Fetch News" and you get a list of news sources. The first entry is Custom. Click on it and it will expand to show the list of your customized news sources.

Auto clean up: This is a powerful feature that enables lay users to make custom recipes. You may be interested in making a single news recipe that has RSS feeds from different blogs and news sources that you visit. This can be done quite easily with "Auto clean up". The following recipe obtains the RSS feeds for the politics section of 3 different news sources, namely, "The Seattle Times", "The San Francisco Chronicle" and "The Los Angeles Times":

from calibre.web.feeds.news import BasicNewsRecipe

class Politics(BasicNewsRecipe):
    title          = u'Politics'
    language       = 'en'
    __author__     = 'Krittika Goyal'
    oldest_article = 3 #days
    max_articles_per_feed = 20
    use_embedded_content = False

    no_stylesheets = True
    auto_cleanup = True
    auto_cleanup_keep = '//div[@class="thumbnail"]'

    feeds          = [
('Seattle Times',
('San Francisco Chronicle',
('LA Times',

The first line, the import (in red), must be in every news recipe. The next block of code (in grey) is information like the title, author etc., which you should change to suit your recipe. The next two lines in red, "no_stylesheets" and "auto_cleanup", are what is used to clean up the web page and remove advertisements and other unwanted material. "auto_cleanup" uses statistical analysis to extract the useful content in the news website or blog. I will return to the blue line, "auto_cleanup_keep", later. The final grey block of code lists the feeds of interest. The output you get (w/o the blue line) for one of the pages of "The San Francisco Chronicle" is shown in the figure below.

"auto_cleanup" is usually great at picking out the relevant content from a variety of websites, so you do not need to manually clean up each website. As a result you can use feeds from different websites even if the articles have very different structures. Sometimes however, "auto_cleanup" can be over zealous and remove content that is indeed relevant like a picture. To fix this you need to understand a little bit of HTML. You need to use "firebug" in firefox or a similar tool to find out the tags corresponding to the part that you would like to remain.

In the above example "auto_cleanup" was removing the picture at the beginning of the articles in "The Los Angeles Times".  To fix that we had to add the blue line of code. Now the picture is included (see figure below).

Without the blue line of code, you would still get the text of the article but the picture would be missing. The "auto_cleanup" feature is based on code from the ReadItLater open source project.
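The value of "auto_cleanup_keep" is an XPath expression that says which elements must survive the clean up. To see what the expression from the recipe above selects, here is a small sketch using Python's standard library on a made-up, heavily simplified page. The markup and the image name are invented for the example, and the standard library supports only a limited subset of XPath, which is why the expression gets a leading "." here.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified article markup; real pages are much messier.
page = """
<html><body>
  <div class="ad">Buy now!</div>
  <div class="thumbnail"><img src="lead-photo.jpg"/></div>
  <div class="story"><p>Article text.</p></div>
</body></html>
"""

root = ET.fromstring(page)

# Same idea as auto_cleanup_keep = '//div[@class="thumbnail"]' in the recipe.
# ElementTree needs ".//" rather than "//" to search the whole document.
kept = root.findall('.//div[@class="thumbnail"]')

# List the pictures inside the elements that would be kept.
pictures = [img.get("src") for div in kept for img in div.iter("img")]
print(pictures)
```

Only the div holding the picture matches, so that is the part "auto_cleanup" is told to leave alone.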

For the more advanced user: Finally, for those of you who are adventurous or experienced in programming, the news customization feature in calibre is very powerful. For tips on using it to its full potential, visit the "tips for developing new recipes" section of the calibre user manual.

The great thing about calibre is that its features are accessible at many levels so both lay users as well as advanced tinkerers find it useful and enjoyable.

Finally, sorry for this blog post being later than usual, but Kovid and I moved to India today and the moving procedure kept me very busy. We still need to settle in, so the next few blog posts may be a little off schedule as well and I may be a little slow in responding to comments. I will do my best to be on time. As always, see you in about a week and hope you found this post useful.