Review of “What is EPUB3?” by Matt Garrish (O’Reilly Media)


Who could explain what EPUB 3 is better than the chief editor of the EPUB 3 suite? Matt Garrish describes the new features of EPUB 3, compared with EPUB 2, and the requirements behind the new version of the format.

Garrish’s work is very impressive. He describes each EPUB 3 feature in depth, listing the differences from other formats and reporting the rationale behind every choice.

Although he describes the benefits of the new format, Garrish repeatedly points out that the new features will be supported by new eReader devices, so an ebook publisher must still think about “old” devices.

Who may read this book.

The book is suitable for people who will write a book with EPUB 3. It describes the new format’s features, such as multimedia support, so those who want to explore new formats for electronic publishing, such as comics or multimedia books, may also find it useful.

How the book is structured.

The book has four main sections:

  • EPUB 3 in a Nutshell: the author describes what an EPUB is and the idea behind EPUB 3;
  • The EPUB 3 Revision: lists the differences between EPUB 2 and EPUB 3. The author explains how the new format goes beyond the limitations and defects of its predecessor;
  • EPUB and Web Standards: EPUB 3 is based on the most important web standards, such as XHTML, CSS, and JavaScript. In this brief section the author describes why and how EPUB was hitched to the web standards wagon;
  • The Goodies: in this section Garrish describes the new format’s features, such as:
    • Multimedia, Media overlays and Graphic content;
    • Scripting;
    • Globalization & Accessibility.

Book data sheet.

Title: What Is EPUB 3?
Author: Matt Garrish
Publisher: O’Reilly Media
Print: N/A
Ebook: September 2011
Pages: 21
Print ISBN: N/A
Ebook ISBN: 978-1-4493-1454-5

This review was written as part of the O’Reilly Blogger Review Program.

O’Reilly gave me a free copy of the book.

Review of “Big Data Glossary” by Pete Warden (O’Reilly Media)


Pete Warden has written a brief (less than fifty pages) but complete overview of “Big Data”, with more than sixty “terms” described. For each term Warden shares a little of his experience with big data, with some suggestions about when you might use the tool described.

Who may read this book.

The book is a good starting point for anyone who has to deal with big data. As expected of a glossary, no term is described in depth, but each entry gives some hints about similar, or ancestor, tools and suggests when you may find it useful to explore that tool. Experienced readers may find the description of a well-known term too brief, but the glossary is so large that they can still find new tools to investigate.

In my opinion the book lacks a complete list of references, but a short internet search makes up for that defect.

As one might suppose, most of the terms in the glossary come from the Google, Yahoo, LinkedIn or Facebook labs, and they are supported by the Apache Foundation. Surprisingly, at least for me, Java and JavaScript are often the languages used by the tools described.

How the book is structured.

The book consists of eleven chapters. The first chapter introduces some basic terms (such as Document-Oriented, Key/Value, MapReduce, Sharding) that are widely used throughout the rest of the book.

The second and third chapters list the terms related to how to access big data, with a NoSQL database or a MapReduce approach.

Chapters four and five describe where to store big data: storage (file systems) and servers. Most of the services and systems listed here are based on cloud computing.

Chapters six to eight contain terms related to big data processing, such as natural language processing or machine learning.

Chapter nine lists some tools and APIs useful for visualizing big data sets as graphs, maps or tables.

Chapter ten suggests some tools useful for coping with the acquisition of big data sets. Datasets are often created manually or are unstructured, like web pages, so the chapter focuses on data cleanup and automatic data extraction.

Serialization is the subject of the last chapter, which describes how to save data or send it across the network.

Book data sheet.


Title: Big Data Glossary
Author: Pete Warden
Publisher: O’Reilly Media
Print: September 2011
Ebook: September 2011
Pages: 60
Print ISBN: 978-1-4493-1459-0
Ebook ISBN: 978-1-4493-1458-3

This review was written as part of the O’Reilly Blogger Review Program.

O’Reilly gave me a free copy of the book.

Hello world!


Hi all.

“Per un pugno di fagioli”, Italian for “for a fistful of beans”, is my tech blog, where I will share what I run into while working, primarily with Java. I will keep it like a diary where I add my two cents, with no ambition of discovering silver bullets.

As the blog title says, I’m Italian, but I will try to write in English, the developer lingua franca.

I hope that this blog will be useful for someone; at the very least it will be useful for me :) .

See you soon.

How to implement the Copy&Paste for a JTable


The standard implementation of JTable lets us copy to the clipboard with Ctrl+C, but not paste from the clipboard with Ctrl+V. Also, we can copy only a full row, not a single cell, in either a tab-separated or an HTML flavour (the format used depends on the target application, that is, on whether it supports a rich format such as HTML).

But how can we make a JTable behave like a spreadsheet, where Ctrl+V is supported and Ctrl+C copies the value of the single selected cell?

The copy action

First of all we have to override the standard behaviour:

final JTable table;
...
ActionListener listener = new ActionListener() {
    public void actionPerformed(ActionEvent event) {
        doCopy();
    }//end actionPerformed(ActionEvent)
};

final KeyStroke stroke = KeyStroke.getKeyStroke(KeyEvent.VK_C, ActionEvent.CTRL_MASK, false);

table.registerKeyboardAction(listener, "Copy", stroke, JComponent.WHEN_FOCUSED);

So, when we press Ctrl+C we call the doCopy() method of the class that encloses the anonymous listener.

private void doCopy() {
    int col = table.getSelectedColumn();
    int row = table.getSelectedRow();
    if (col != -1 && row != -1) {
        Object value = table.getValueAt(row, col);
        String data;
        if (value == null) {
            data = "";
        } else {
            data = value.toString();
        }//end if

        final StringSelection selection = new StringSelection(data);     

        final Clipboard clipboard = Toolkit.getDefaultToolkit().getSystemClipboard();
        clipboard.setContents(selection, selection);
    }//end if
}//end doCopy()

The clipboard

The clipboard, java.awt.datatransfer.Clipboard, is the component that lets us interact with the system clipboard. Java has the concept of a Data Flavor. A Data Flavor[1] lets us handle different formats: for example, a client that understands a rich format such as HTML or RTF may use the content as is, while another may use it as bare text.
Usually we access the system clipboard through the AWT toolkit, Toolkit.getDefaultToolkit().getSystemClipboard(), but we may create our own clipboard in order to limit copy&paste visibility to the local VM or to extend it across several remote machines.
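
As a rough sketch of the "own clipboard" idea, a Clipboard can be instantiated directly, so its content is visible only inside the current VM. The class name LocalClipboardDemo and the clipboard name "local-demo" are made up for this example; the sketch also lists the flavors that a StringSelection offers.

```java
import java.awt.datatransfer.Clipboard;
import java.awt.datatransfer.DataFlavor;
import java.awt.datatransfer.StringSelection;
import java.awt.datatransfer.Transferable;
import java.awt.datatransfer.UnsupportedFlavorException;
import java.io.IOException;

public class LocalClipboardDemo {
    // A clipboard private to this VM; the name is arbitrary (assumption of this sketch).
    private static final Clipboard LOCAL = new Clipboard("local-demo");

    /** Stores the text in the private clipboard; no owner callback is needed here. */
    public static void copy(String text) {
        LOCAL.setContents(new StringSelection(text), null);
    }

    /** Returns the clipboard text, or null when no String flavor is available. */
    public static String paste() {
        Transferable content = LOCAL.getContents(null);
        if (content == null || !content.isDataFlavorSupported(DataFlavor.stringFlavor)) {
            return null;
        }
        try {
            return (String) content.getTransferData(DataFlavor.stringFlavor);
        } catch (UnsupportedFlavorException | IOException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        copy("hello");
        // Print the MIME type of every flavor the current content can be read as.
        for (DataFlavor flavor : LOCAL.getContents(null).getTransferDataFlavors()) {
            System.out.println(flavor.getMimeType());
        }
        System.out.println(paste());
    }
}
```

Other applications never see what is stored in LOCAL, which is the point of using a private clipboard instead of the system one.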

The paste action

As for the copy action we have to register[2] the keyboard action:

final JTable table;     
...      
ActionListener listener = new ActionListener(){
    public void actionPerformed(ActionEvent event) {
        doPaste();
    }//end actionPerformed(ActionEvent)
};      

final KeyStroke stroke = KeyStroke.getKeyStroke(KeyEvent.VK_V, ActionEvent.CTRL_MASK, false);
table.registerKeyboardAction(listener, "Paste", stroke, JComponent.WHEN_FOCUSED);

We assume that the table stores its data as String; handling Number, Date, and so on is straightforward.

private void doPaste() {
    final Clipboard clipboard = Toolkit.getDefaultToolkit().getSystemClipboard();
    final Transferable content = clipboard.getContents(this);
    if (content != null) {
        try {
            final String value = content.getTransferData(DataFlavor.stringFlavor).toString();     

            final int col = table.getSelectedColumn();
            final int row = table.getSelectedRow();
            // Skip the paste when nothing is selected (row or col is -1)
            if (row != -1 && col != -1 && table.isCellEditable(row, col)) {
                table.setValueAt(value, row, col);
                if (table.getEditingRow() == row && table.getEditingColumn() == col) {
                    final CellEditor editor = table.getCellEditor();
                    editor.cancelCellEditing();
                    table.editCellAt(row, col);
                }//end if
            }//end if
            table.repaint();
        } catch (UnsupportedFlavorException e) {
            // String have to be the standard flavor
            System.err.println("UNSUPPORTED FLAVOR EXCEPTION " + e.getLocalizedMessage());
        } catch (IOException e) {
            // The data is consumed?
            System.err.println("DATA CONSUMED EXCEPTION " + e.getLocalizedMessage());
        }//end try
    }//end if
}//end doPaste()

The first step is to get the clipboard content as a Transferable object. If this content is null, the clipboard is empty and we have no data to paste. Next we have to translate the content into a format that we can understand. Because we handle String, we ask for the data in the String flavor. Almost any content can be translated into a String, but we may ask the Transferable object for its supported flavors in order to obtain a richer one.

As the last step we copy the clipboard value into the selected cell, provided it is editable.

[1] Flavour in UK English, Flavor in US English. This may be the source of some typos.
[2] The registerKeyboardAction method of JComponent is documented as obsolete, even though it is not marked with a @deprecated tag. This method is a shorthand for:
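
The String-only assumption made for the paste action can be relaxed with a small conversion step. The helper below is only a sketch: the class name PasteConverter, the set of supported types and the yyyy-MM-dd date pattern are assumptions of this example, not something JTable provides.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

public class PasteConverter {
    /**
     * Converts clipboard text to the type declared by a column
     * (for a JTable, the value of table.getColumnClass(col)).
     * Returns null when the text cannot be parsed as that type.
     */
    public static Object convert(String text, Class<?> columnClass) {
        String trimmed = text.trim();
        try {
            if (columnClass == Integer.class) {
                return Integer.valueOf(trimmed);
            }
            if (columnClass == Double.class || columnClass == Number.class) {
                return Double.valueOf(trimmed);
            }
            if (columnClass == Boolean.class) {
                return Boolean.valueOf(trimmed);
            }
            if (columnClass == Date.class) {
                // Assumed pattern; a real table would reuse the format of its renderer.
                SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd");
                format.setLenient(false);
                return format.parse(trimmed);
            }
        } catch (NumberFormatException | ParseException e) {
            return null;
        }
        // String and any unknown type: paste the text unchanged.
        return text;
    }
}
```

Inside doPaste() the call would then become something like table.setValueAt(PasteConverter.convert(value, table.getColumnClass(col)), row, col), skipping the paste when the result is null.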

public void registerKeyboardAction(ActionListener anAction,String aCommand,KeyStroke aKeyStroke,int aCondition) {
    final InputMap inputMap = getInputMap(aCondition, true);
    if (inputMap != null) {
        final ActionMap actionMap = getActionMap(true);
        final ActionStandin action = new ActionStandin(anAction, aCommand);
        inputMap.put(aKeyStroke, action);
        if (actionMap != null) {
            actionMap.put(action, action);
        }
    }
}
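
The same binding can be written directly against the InputMap and the ActionMap, which is what the Javadoc suggests instead of registerKeyboardAction. This is a minimal sketch: the class name KeyBindingDemo, the helper bind and the action name "paste-cell" are invented for the example.

```java
import java.awt.event.ActionEvent;
import java.awt.event.InputEvent;
import java.awt.event.KeyEvent;
import javax.swing.AbstractAction;
import javax.swing.ActionMap;
import javax.swing.InputMap;
import javax.swing.KeyStroke;

public class KeyBindingDemo {
    /** Maps stroke -> name in the InputMap and name -> Action in the ActionMap. */
    public static void bind(InputMap inputMap, ActionMap actionMap,
                            KeyStroke stroke, String name, final Runnable task) {
        inputMap.put(stroke, name);
        actionMap.put(name, new AbstractAction(name) {
            @Override
            public void actionPerformed(ActionEvent event) {
                task.run();
            }
        });
    }

    public static void main(String[] args) {
        // Stand-alone maps keep the sketch runnable without building a GUI.
        InputMap inputMap = new InputMap();
        ActionMap actionMap = new ActionMap();
        KeyStroke ctrlV = KeyStroke.getKeyStroke(KeyEvent.VK_V, InputEvent.CTRL_DOWN_MASK);
        bind(inputMap, actionMap, ctrlV, "paste-cell",
                () -> System.out.println("paste requested"));

        // Simulate the lookup Swing performs when Ctrl+V is pressed.
        Object key = inputMap.get(ctrlV);
        actionMap.get(key).actionPerformed(
                new ActionEvent(inputMap, ActionEvent.ACTION_PERFORMED, "paste-cell"));
    }
}
```

With a real table, the maps would come from table.getInputMap(JComponent.WHEN_FOCUSED) and table.getActionMap(), and the task would call doPaste().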

Minimal Hello World


It is said that a language is not general purpose if it needs a large number of statements to print Hello World on screen.

But in Java, how many characters does the smallest class that prints “Hello World” take? The answer is 21.

Class B holds this record.

class B extends A { }

This is possible because it inherits all the public and protected methods of class A, including the main method.

public class A { 
  public A() { 
      super(); 
  }  

  public void sayHello() { 
      System.out.println("Hello World"); 
  }  

  public static void main(String[] args) { 
     A a = new A(); 
     a.sayHello(); 
  } 
}//:-

The style is not the best, but I consider it a nice example for explaining inheritance.

Serialization


Serialization, in object-oriented languages, is the process that stores the internal state of an object on a physical medium, be it a hard disk or a network connection, so that it can be “read back” at a later time (1).

It underlies many architectures: objects saved in a web session must be serializable (2), as must those sent over IIOP, even though it is considered a security flaw. In any case, to make a class serializable it is not always enough to implement the java.io.Serializable interface; some rather particular aspects also need attention.

Not everything is serializable

The internal state of an object cannot always be saved and later loaded without problems. Think, for example, of references to operating system data structures, file pointers or running threads, which make no sense outside the system, or the process, in which they are active. Or think of data whose meaning is tied to a specific period of time, such as caches or sessions, which become useless as time passes. Serialization, after all, makes it possible to save data that may be read back even years later.
Moreover, an object is serializable only if all the data structures it refers to are serializable in turn.

To prevent an attribute of an object from being saved, it must be declared transient. Such attributes must be reinitialized, where that makes sense, during deserialization.

Constructors and deserialization

We must pay attention to the fact that deserialization creates the object without going through the constructor; not even the default one is used. For example, this class

public class SerializationTest implements Serializable { 
    private int intValue = 1; 
    private String strValue = "init"; 
    private transient int intTransient = 1; 
    private transient String strTransient = "init";             

    public SerializationTest() { 
        intValue = 2; 
        strValue = "constructor"; 
        intTransient = 2; 
        strTransient = "constructor"; 
    }             

    public void touch() { 
        intValue = 3; 
        strValue = "touched"; 
        intTransient = 3; 
        strTransient = "touched"; 
    }             

    @Override 
    public String toString() { 
        return "Value <" + intValue + "," + strValue + "> Transient <" + intTransient + "," + strTransient + ">"; 
    } 
}//:~

when used as follows

    SerializationTest test = new SerializationTest(); 
    System.out.println("Before touch"); 
    System.out.println(test); 
    test.touch(); 
    System.out.println("Before serialization"); 
    System.out.println(test); 
    File tempFile = File.createTempFile("test", "tmp"); 
    ObjectOutputStream oos = new ObjectOutputStream(new FileOutputStream(tempFile)); 
    oos.writeObject(test); 
    oos.close();          

    ObjectInputStream ois = new ObjectInputStream(new FileInputStream(tempFile)); 
    SerializationTest test2 = (SerializationTest) ois.readObject(); 
    ois.close();          

    System.out.println("After deserialization"); 
    System.out.println(test2);

one would expect a result like:

Before touch 
Value <2,constructor> Transient <2,constructor> 
Before serialization 
Value <3,touched> Transient <3,touched> 
After deserialization 
Value <3,touched> Transient <1,init>

instead we get:

Before touch 
Value <2,constructor> Transient <2,constructor> 
Before serialization 
Value <3,touched> Transient <3,touched> 
After deserialization 
Value <3,touched> Transient <0,null>

which leaves the transient fields uninitialized. Indeed, one of thesp0nge’s tips is not to trust the constructor. To overcome this problem we can define these two methods:

  • private void writeObject(ObjectOutputStream out) throws IOException;
  • private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException;

For example:

    private void writeObject(ObjectOutputStream out) throws IOException { 
        out.defaultWriteObject(); 
    }       
    
    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException { 
        in.defaultReadObject(); 
        // Restore the transient fields to their declared defaults
        intTransient = 1; 
        strTransient = "init"; 
    }

Note that the first statement to call, in both methods, is the default method, which allows the object to be built correctly. Also note that the methods are private void, so they can be neither invoked from outside nor overridden. These two methods exist solely for the use of the JVM.
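
The effect of a custom readObject can be checked without touching the disk. The following is only a sketch: the class TransientRestoreDemo and its fields are invented for the example, and the round trip goes through an in-memory byte array instead of a temporary file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class TransientRestoreDemo implements Serializable {
    private static final long serialVersionUID = 1L;

    private String name = "init";
    private transient String cache = "init";

    public void touch() {
        name = "touched";
        cache = "touched";
    }

    // Called by the serialization runtime, never by application code.
    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject(); // restores the non-transient fields
        cache = "init";         // re-initializes the transient one
    }

    public String describe() {
        return name + "/" + cache;
    }

    /** Serializes the object to memory and reads it back. */
    public static TransientRestoreDemo roundTrip(TransientRestoreDemo source) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(source);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (TransientRestoreDemo) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        TransientRestoreDemo demo = new TransientRestoreDemo();
        demo.touch();
        System.out.println(roundTrip(demo).describe());
    }
}
```

Without the readObject method, the transient field of the copy would be null after the round trip.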

Class versions

The serialization/deserialization process works only if the structure of the classes does not change. To manage class versions, Java relies on the Serial Version of the class, a version that may change from one compilation of the class to the next. If the version differs, deserialization throws a java.io.InvalidClassException.

Often, however, only the methods of a class change, while the attributes stay the same. In these cases the version can be forced by manually defining the attribute private static final long serialVersionUID.
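
A short sketch of how the version can be pinned and inspected. The class names VersionDemo, Pinned and Computed and the value 42L are arbitrary choices for this example; ObjectStreamClass.lookup is the standard way to read the version the runtime will use.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

public class VersionDemo {
    /** The version is declared by hand, so it survives changes to the methods. */
    public static class Pinned implements Serializable {
        private static final long serialVersionUID = 42L;
        int value;
    }

    /** No declared version: the runtime computes one from the class structure. */
    public static class Computed implements Serializable {
        int value;
    }

    /** Returns the serial version the runtime associates with a serializable class. */
    public static long versionOf(Class<?> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println("Pinned:   " + versionOf(Pinned.class));
        System.out.println("Computed: " + versionOf(Computed.class));
    }
}
```

The value printed for Computed depends on the class structure, which is why it is not stable across changes to the class.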

Blocking serialization

Finally, to protect the class by preventing it from being serialized, define the writeObject and readObject methods as follows:

    private void writeObject(ObjectOutputStream out) throws IOException { 
        throw new NotSerializableException(); 
    }     
    
    private void readObject(ObjectInputStream in) throws IOException { 
        throw new NotSerializableException(); 
    }
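
A minimal sketch of the blocking technique in action; the class name BlockedDemo and the helper canSerialize are invented for the example. In practice this is mostly useful when Serializable is inherited from a parent class and cannot simply be dropped.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class BlockedDemo implements Serializable {
    private void writeObject(ObjectOutputStream out) throws IOException {
        throw new NotSerializableException(BlockedDemo.class.getName());
    }

    private void readObject(ObjectInputStream in) throws IOException {
        throw new NotSerializableException(BlockedDemo.class.getName());
    }

    /** Tries to serialize the candidate to memory and reports whether it worked. */
    public static boolean canSerialize(Object candidate) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(candidate);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(canSerialize("a plain string"));
        System.out.println(canSerialize(new BlockedDemo()));
    }
}
```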

Notes

(1) Although someone told me they thought it was the process of putting objects in sequence :D
(2) Some servlet containers store sessions on disk or share them among the nodes of a cluster.
(3) In practice the default values, 1 and init, assigned to the attributes where they are declared are useless.