Adding Data Sources

From autoplot.org


Contents

  1. Autoplot Internals Overview
    1. QDataSet
    2. Metadata Handling
    3. DataSource
    4. Existing Data Sources
    5. Autoplot
  2. Overview of Tutorial
  3. Create a Dummy DataSource and Register the DataSource Type
    1. Set up project
    2. Define two classes
      1. WavDataSource
      2. WavDataSourceFactory
    3. Register with Autoplot
    4. Test
  4. Implement the DataSource
    1. wav file format
    2. Get a local copy of the file, create QDataSet
    3. Test
  5. Adding Metadata to the DataSource
    1. Return arbitrary Metadata that is presented in the metadata tab
    2. Assigning time tags with the QDataSet model
  6. Adding parameters to the URI
    1. Getting the Parameters
    2. Completion
  7. What's Next

1 Autoplot Internals Overview

The core of Autoplot consists of four jars:

  • QDataSet, the data model
  • DataSource, gets data into the data model
  • DataSourcePack, various implementations of DataSource, including AsciiTable and das2Stream.
  • VirboAutoplot, the application based on all of the jars.

Other jar files are used to read data from different file formats (e.g., CDF and netCDF) or for UI functions (e.g., jars for writing PNG files). To see the list of jar files used, save http://www.autoplot.org/jnlp/autoplot.jnlp as a text file and read its contents.

1.1 QDataSet

  • QDataSet is the data model, which is a Java interface
  • QDataSet implementations, e.g. DDataSet wraps a double array
  • Adapters to the das2 data model. (QDataSet will eventually become the das2 data model.)
  • Operators
    • slice
    • data reduction
    • aggregation
  • Utilities
    • ascii table parser
    • data set builder

1.2 Metadata Handling

Metadata is conveyed using a tree built from Map objects. The root node is a Map<String,Object>, and each value is either another Map<String,Object> (a branch) or an immutable object (a leaf).
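
As a rough illustration of such a tree, the sketch below builds a two-level Map and walks it. The class name MetadataTreeDemo and the keys and values are made up for this example; they are not taken from Autoplot.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical demo class; keys and values are invented for illustration.
class MetadataTreeDemo {

    /** Builds a small two-level metadata tree out of nested maps. */
    static Map<String, Object> buildTree() {
        // A branch is simply another Map<String,Object>.
        Map<String, Object> format = new LinkedHashMap<>();
        format.put("channels", 1);             // leaf: immutable Integer
        format.put("bits", 16);                // leaf: immutable Integer
        format.put("encoding", "PCM_SIGNED");  // leaf: immutable String

        Map<String, Object> root = new LinkedHashMap<>();
        root.put("title", "example.wav");
        root.put("format", Collections.unmodifiableMap(format));
        return root;
    }

    /** Renders one line per entry, indenting the contents of each branch. */
    static String render(Map<String, Object> node, String indent) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Object> e : node.entrySet()) {
            Object value = e.getValue();
            if (value instanceof Map) {
                sb.append(indent).append(e.getKey()).append(":\n");
                @SuppressWarnings("unchecked")
                Map<String, Object> branch = (Map<String, Object>) value;
                sb.append(render(branch, indent + "  "));
            } else {
                sb.append(indent).append(e.getKey()).append("=").append(value).append("\n");
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(render(buildTree(), ""));
    }
}
```

A viewer such as a metadata tab can display any tree of this shape without knowing where it came from, which is presumably why plain Map objects are used rather than a dedicated node class.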

1.3 DataSource

  • DataSources know how to model other data models as QDataSets.
  • DataSources also provide metadata as a Map<String,Object> with name=value leaves.
  • DataSourceFactory implementations
    • translate from URL to DataSource
    • provide completion model to generate valid URLs
  • DataSourceRegistry
    • table of lookups: extension to DataSourceFactory and mime-type to DataSourceFactory
  • DataSetURL
    • uses DataSourceRegistry, DataSourceFactory to create DataSource from URL
    • provides filesystem completion or dataSource completion by delegation.
  • MetaDataModel
    • translates various metadata models into canonical QDataSet metadata model.
    • for example, ISTPMetaDataModel allows CDF files to be interpreted via local file access or via OpenDAP.
  • org.virbo.aggregator
    • AggregatingDataSource aggregates another datasource using das2 FileStorageModel
    • AggregatingDataSourceFactory uses a delegate DataSourceFactory and a representative file.
  • DataSetSelector is a GUI component that provides a user interface to the completion model.
  • DataSource.getCapability adds various capabilities, such as TimeSeriesBrowse
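
The extension lookup described above can be pictured with a minimal sketch. Everything here is a hypothetical stand-in: ExtensionRegistryDemo and its nested Factory interface are not Autoplot's DataSourceRegistry or DataSourceFactory, and the real classes may behave differently.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-ins: these are NOT Autoplot's classes.
class ExtensionRegistryDemo {

    /** Stand-in for a factory that turns a URI into a data source. */
    interface Factory {
        String describe(String uri);
    }

    private final Map<String, Factory> byExtension = new HashMap<>();

    /** Associates a file extension (without the dot) with a factory. */
    void register(String extension, Factory factory) {
        byExtension.put(extension.toLowerCase(Locale.ROOT), factory);
    }

    /** Extracts the lower-cased file extension, ignoring any ?name=value parameters. */
    static String extensionOf(String uri) {
        int q = uri.indexOf('?');
        String path = q < 0 ? uri : uri.substring(0, q);
        int slash = path.lastIndexOf('/');
        int dot = path.lastIndexOf('.');
        return dot > slash ? path.substring(dot + 1).toLowerCase(Locale.ROOT) : "";
    }

    /** Returns the factory registered for the URI's extension, or null if none. */
    Factory lookup(String uri) {
        return byExtension.get(extensionOf(uri));
    }

    public static void main(String[] args) {
        ExtensionRegistryDemo registry = new ExtensionRegistryDemo();
        registry.register("wav", uri -> "wav source for " + uri);
        Factory f = registry.lookup("file:///data/mywav.wav?offset=1.0&length=0.2");
        System.out.println(f == null ? "no factory" : f.describe("file:///data/mywav.wav"));
    }
}
```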

1.4 Existing Data Sources

  • various DataSources know how to model other data models as QDataSets, for example:
    • ASCII--uses internal QDataSet ASCII table parser to read in table
    • Binary--binary files.
    • Excel--uses the external Jakarta POI library to read data from Excel spreadsheets.
    • das2Stream--das2 streams allow streaming of data and metadata. Bob Weigel's TSDS server can send these streams.
    • netCDF--adapts the NetCDF data model.
    • OpenDAP--OpenDAP is a web API for accessing remote data.
    • SPASE--allows a SPASE record to wrap another data source and provide metadata for it.
    • FITS--FITS files used in astronomy
    • Wav--.wav files. This is the result of this tutorial.
    • CDF--ISTP CDF files
    • CEF--Cluster Exchange Format files
    • Jython--Jython code is used to compose datasets.

1.5 Autoplot

  • ApplicationModel is the legacy internal Model and Controller (as in MVC) of the Autoplot application. (The state and things the application can do.)
  • the package "dom" implements the "DOM" tree containing the application state.
  • AutoPlotUI is a GUI View of the model.
  • the package "state" has classes for undo/redo support, and saving application state into .vap files.
  • the package "transferrable" supports copy image to OS clipboard.
  • the package "server" is the back-end server that provides access to console to support scripting.
  • the package "scriptconsole" is a GUI for browsing and executing Jython code. It also contains LogConsole, which provides access to log messages.

2 Overview of Tutorial

In this tutorial, we will add the ability to plot .wav files to Autoplot. First, we will create a dummy data source and register it with Autoplot. Then we will actually read the data and return the waveform. Finally, we'll add an additional capability and see how completions are added.

To avoid the tedium of managing jar files and compiling Java code by hand, this tutorial assumes you are using NetBeans 6.5+ and can build Autoplot as described in Autoplot_from_source and Autoplot_from_source_netbeans.

3 Create a Dummy DataSource and Register the DataSource Type

Before writing the code to deal with the wav format, we'll create the data source with a dummy implementation and confirm that it gets properly registered with Autoplot.

3.1 Set up project

  • Create a new project "WavDataSource"
  • Open WavDataSource project properties and click on "Libraries"
    • Click "Add Project" and add QDataSet
    • Click "Add Project" and add DataSource
    • Click "Add Project" and add DasCore

3.2 Define two classes

Create a new package "org.virbo.datasource.wav". All class references are within this package.

3.2.1 WavDataSource

  • Create a new class WavDataSource that extends AbstractDataSource. (An abstract class is mostly implemented already; subclasses finish the work.)
  • Define a constructor that accepts a URL and calls the superclass constructor.
  • Provide an implementation of the abstract method getDataSet. For now this is a dummy method for testing; the real implementation is presented in the next section.
public QDataSet getDataSet(ProgressMonitor mon) throws Exception {
    return DataSetUtil.replicateDataSet( 30, 1.0 );
}

3.2.2 WavDataSourceFactory

  • Create a new class WavDataSourceFactory that extends AbstractDataSourceFactory.
  • Implement getDataSource(URL url) to simply return new WavDataSource(url)

3.3 Register with Autoplot

This is the tricky part, since the registration is done with text files that the Java compiler does not check, and it's easy to make mistakes.

  • Create a text file META-INF/org.virbo.datasource.DataSourceFactory.extensions containing the fully qualified name of your factory class (which must have a no-argument constructor), followed by the extensions it handles (e.g. wav).
  • Alternatively, the plugin can be registered by mime type using the text file META-INF/org.virbo.datasource.DataSourceFactory.mimeTypes. Note: mime types are only used with http URLs. DEPRECATED: don't use this; it will probably go away.

The META-INF directory should appear in the project's src folder.

For Autoplot to locate and load your new data source, WavDataSource.jar must appear in the classpath. When running from NetBeans, the easiest way to do this is to add the WavDataSource project as a library in the VirboAutoplot project properties.

3.4 Test

  • Run Autoplot.
  • For the URL, enter "file:///foo.wav".
  • Thirty points, each with the value 1.0, should be plotted.

4 Implement the DataSource

We will do all the work in the getDataSet method of WavDataSource.

4.1 wav file format

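A wav file is essentially a large binary array preceded by a header that carries the encoding information. This tutorial assumes the header occupies the first 64 bytes of the file. Note that the canonical PCM header is 44 bytes and that real files may carry additional chunks, so robust code should locate the data chunk rather than hard-code an offset. We will use Java's AudioSystem class to parse the header. For this example, we will only support mono (one-channel) formats. Also, we will not write our own implementation of the QDataSet interface for now; instead we will use java.nio, which handles byte order (endianness), to fill existing implementations.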
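
To see what AudioSystem reports without needing a file on disk, the following sketch writes a tiny silent mono wav into memory and parses its header back. WavHeaderDemo and its method names are hypothetical helpers for this illustration and are not part of Autoplot; only the javax.sound.sampled calls are standard Java.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;

// Hypothetical demo class, not part of Autoplot.
class WavHeaderDemo {

    /** Encodes the given number of silent 16-bit mono frames as an in-memory wav file. */
    static byte[] makeWav(int frames, float sampleRate) {
        // 16 bits, 1 channel, signed, little-endian.
        AudioFormat format = new AudioFormat(sampleRate, 16, 1, true, false);
        byte[] pcm = new byte[frames * format.getFrameSize()];
        AudioInputStream in =
                new AudioInputStream(new ByteArrayInputStream(pcm), format, frames);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            AudioSystem.write(in, AudioFileFormat.Type.WAVE, out);
        } catch (IOException e) {
            throw new IllegalStateException("could not encode wav", e);
        }
        return out.toByteArray();
    }

    /** Lets AudioSystem parse the header of in-memory wav bytes. */
    static AudioFileFormat parse(byte[] wav) {
        try {
            return AudioSystem.getAudioFileFormat(new ByteArrayInputStream(wav));
        } catch (IOException | UnsupportedAudioFileException e) {
            throw new IllegalStateException("could not parse wav header", e);
        }
    }

    public static void main(String[] args) {
        byte[] wav = makeWav(100, 8000f);
        AudioFileFormat fileFormat = parse(wav);
        AudioFormat format = fileFormat.getFormat();
        int dataBytes = fileFormat.getFrameLength() * format.getFrameSize();
        System.out.println("encoding   = " + format.getEncoding());
        System.out.println("channels   = " + format.getChannels());
        System.out.println("bits       = " + format.getSampleSizeInBits());
        System.out.println("bigEndian  = " + format.isBigEndian());
        System.out.println("header len = " + (wav.length - dataBytes) + " bytes");
    }
}
```

Subtracting the sample data size from the total length, as in the last line, is one way to estimate the header size of a simple file instead of assuming a fixed value.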

4.2 Get a local copy of the file, create QDataSet

public QDataSet getDataSet(ProgressMonitor mon) throws Exception {
    File wavFile = DataSetURL.getFile(this.url, mon);
 
    AudioFileFormat fileFormat = AudioSystem.getAudioFileFormat(wavFile);
    AudioFormat audioFormat = fileFormat.getFormat();
    FileInputStream fin = new FileInputStream(wavFile);
    ByteBuffer byteBuffer = fin.getChannel().map(MapMode.READ_ONLY, 64, wavFile.length() - 64);
 
    int frameSize = audioFormat.getFrameSize();
    int frameCount = (byteBuffer.limit() - byteBuffer.position()) / frameSize;
    int bits = audioFormat.getSampleSizeInBits();
    boolean unsigned= audioFormat.getEncoding().equals(AudioFormat.Encoding.PCM_UNSIGNED );
 
    if ( unsigned ) {
        throw new IllegalArgumentException("Unsupported wave file format: " + audioFormat + ", need signed.");
    }
    if (audioFormat.getChannels() > 1) {
        throw new IllegalArgumentException("Unsupported wave file format: " + audioFormat + ", need mono.");
    }
    if (bits != 16 && bits != 8) {
        throw new IllegalArgumentException("Unsupported wave file format: " + audioFormat + ", need 8 or 16 bits.");
    }
 
    QDataSet result;
 
    if (bits == 16) {
        if (audioFormat.isBigEndian()) {
            byteBuffer.order(ByteOrder.BIG_ENDIAN);
        } else {
            byteBuffer.order(ByteOrder.LITTLE_ENDIAN);
        }
        ShortBuffer shortBuffer = byteBuffer.asShortBuffer();
        short[] buf = new short[frameCount];
        shortBuffer.get(buf);
        result = SDataSet.wrap(buf);
    } else {
        byte[] buf = new byte[frameCount];
        byteBuffer.get(buf);
        result = BDataSet.wrap(buf);
    }
 
    return result;
}
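
The byte-order handling in the 16-bit branch can be tried in isolation. This is a minimal sketch using only java.nio; the class EndianDemo and the sample bytes are made up for the illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.ShortBuffer;
import java.util.Arrays;

// Hypothetical demo class, not part of Autoplot.
class EndianDemo {

    /** Interprets raw bytes as 16-bit signed samples in the given byte order. */
    static short[] decode16(byte[] raw, boolean bigEndian) {
        ByteBuffer bytes = ByteBuffer.wrap(raw);
        bytes.order(bigEndian ? ByteOrder.BIG_ENDIAN : ByteOrder.LITTLE_ENDIAN);
        // The short view inherits the byte order set on the byte buffer.
        ShortBuffer shorts = bytes.asShortBuffer();
        short[] samples = new short[shorts.remaining()];
        shorts.get(samples);
        return samples;
    }

    public static void main(String[] args) {
        byte[] raw = { 0x01, 0x00, (byte) 0xFF, (byte) 0xFF };
        System.out.println(Arrays.toString(decode16(raw, false))); // prints [1, -1]
        System.out.println(Arrays.toString(decode16(raw, true)));  // prints [256, -1]
    }
}
```

The same four bytes decode to different sample values depending on the order, which is why the code consults audioFormat.isBigEndian() before creating the ShortBuffer.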

If you are interested in the QDataSet data model, you'll want to look at the QDataSet Intro.

4.3 Test

Run Autoplot again and this time point it to a .wav file. Windows "Sound Recorder" can be used to produce the file, or use this one: [1]

5 Adding Metadata to the DataSource

5.1 Return arbitrary Metadata that is presented in the metadata tab

Override method "getMetaData(ProgressMonitor)":

public Map<String,Object> getMetaData(ProgressMonitor mon) throws Exception {
    // Read the format information from the wav file's header.
    AudioFileFormat fileFormat = AudioSystem.getAudioFileFormat(url);
    AudioFormat audioFormat = fileFormat.getFormat();
    // Each entry becomes a name=value leaf in the metadata tab.
    Map<String,Object> properties = new HashMap<String,Object>();
    properties.put( "encoding", audioFormat.getEncoding() );
    properties.put( "channels", audioFormat.getChannels() );
    properties.put( "frame rate", audioFormat.getFrameRate() );
    properties.put( "bits", audioFormat.getSampleSizeInBits() );
    return properties;
}

Here ProgressMonitor is org.das2.util.monitor.ProgressMonitor.

5.2 Assigning time tags with the QDataSet model

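To specify physical time tags with the QDataSet model, we create a second dataset that holds the time of each sample and attach it to the result. Add the following to the getDataSet method, before the return statement. Because putProperty is defined by MutablePropertyDataSet rather than by the read-only QDataSet interface, result must be declared as a MutablePropertyDataSet for this to compile: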

MutablePropertyDataSet timeTags= DataSetUtil.tagGenDataSet( frameCount, 0., 1./audioFormat.getSampleRate() );
timeTags.putProperty( QDataSet.UNITS, Units.seconds );
result.putProperty( QDataSet.DEPEND_0, timeTags );

In the QDataSet model, the property DEPEND_0 contains a dataset with the values of the independent variable.
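
For intuition about the values that end up in DEPEND_0, here is a plain-Java sketch of evenly spaced time tags. It assumes that tagGenDataSet(n, start, cadence) yields start + i*cadence for i = 0..n-1, which is how the call above reads; TimeTagDemo itself is a hypothetical helper and does not use the QDataSet classes.

```java
// Hypothetical demo class; it mimics the assumed behavior of tagGenDataSet with a plain array.
class TimeTagDemo {

    /** Returns the time in seconds of each frame, starting at zero. */
    static double[] timeTags(int frameCount, double sampleRate) {
        double cadence = 1.0 / sampleRate;   // seconds between successive frames
        double[] tags = new double[frameCount];
        for (int i = 0; i < frameCount; i++) {
            tags[i] = i * cadence;
        }
        return tags;
    }

    public static void main(String[] args) {
        double[] tags = timeTags(5, 8000.0);
        for (int i = 0; i < tags.length; i++) {
            System.out.println("frame " + i + " at " + tags[i] + " s");
        }
    }
}
```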

6 Adding parameters to the URI

Let's add the ability to look at just a part of the .wav file. We'll add two keywords, offset and length, to our dataset URIs. For example, file:///data/mywav.wav?offset=1.0&length=0.2 means start 1.0 seconds into the wav file and clip after 0.2 seconds.

6.1 Getting the Parameters

Since we are extending AbstractDataSource, the parameters are available in the Map<String,String> params. The implementation is straightforward: read offset and length from the map, convert them from seconds to frames, and read only that range. The full source is in WavDataSource.java.
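
One possible way to turn the two parameters into a frame range is sketched below. This is not the code from WavDataSource.java; WavParamsDemo, frameRange, and the clamping policy are assumptions made for the illustration, and the params map here is an ordinary HashMap standing in for the one provided by AbstractDataSource.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; not the implementation found in WavDataSource.java.
class WavParamsDemo {

    /** Returns {firstFrame, frameCount} for the requested clip, clamped to the file. */
    static int[] frameRange(Map<String, String> params, double sampleRate, int totalFrames) {
        // offset is in seconds and defaults to the start of the file.
        double offsetSeconds = Double.parseDouble(params.getOrDefault("offset", "0"));
        int first = (int) Math.round(offsetSeconds * sampleRate);
        first = Math.max(0, Math.min(first, totalFrames));

        // length is in seconds and defaults to everything after the offset.
        int count = totalFrames - first;
        String length = params.get("length");
        if (length != null) {
            int requested = (int) Math.round(Double.parseDouble(length) * sampleRate);
            count = Math.max(0, Math.min(requested, count));
        }
        return new int[] { first, count };
    }

    public static void main(String[] args) {
        // Parameters as they would appear for ...mywav.wav?offset=1.0&length=0.2
        Map<String, String> params = new HashMap<>();
        params.put("offset", "1.0");
        params.put("length", "0.2");
        int[] range = frameRange(params, 8000.0, 16000);
        System.out.println("first frame = " + range[0] + ", frames = " + range[1]);
    }
}
```

With a memory-mapped file, the resulting range could then be applied by offsetting the buffer position by firstFrame * frameSize bytes and reading frameCount frames.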

6.2 Completion

The DataSourceFactory can provide a set of completions so that users can more easily use the data source. In our class WavDataSourceFactory, we now override getCompletions:

public List<CompletionContext> getCompletions(CompletionContext cc) {
    List<CompletionContext> result= new ArrayList<CompletionContext>();
    if ( cc.context.equals(CompletionContext.CONTEXT_PARAMETER_NAME ) ) {
        result.add( new CompletionContext( CompletionContext.CONTEXT_PARAMETER_NAME, "offset" ) );
        result.add( new CompletionContext( CompletionContext.CONTEXT_PARAMETER_NAME, "length" ) );
    } else if ( cc.context.equals(CompletionContext.CONTEXT_PARAMETER_VALUE ) ) {
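        // paramName identifies the parameter being completed. It is unused here because
        // both offset and length take a double.
        String paramName= CompletionContext.get( CompletionContext.CONTEXT_PARAMETER_NAME, cc );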
        result.add( new CompletionContext( CompletionContext.CONTEXT_PARAMETER_VALUE, "<double>" ) );
    }
    return result;
}

7 What's Next

This tutorial uses existing implementations of QDataSet that force the data to be read into memory. This limits the size of the wav file that can be browsed to about 30 seconds at 8000 Hz. One could create a QDataSet implementation that wraps the memory-mapped ByteBuffer, so that data is only swapped into physical memory as it is accessed. (See java.nio for information about ByteBuffer.)
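
The idea can be sketched without the QDataSet interface. LazyShortSamples below is a hypothetical class with a simplified length()/value(i) shape, not an implementation of QDataSet; it only shows that samples can be decoded on access instead of being copied into an array up front. In the data source, the buffer would be the MappedByteBuffer returned by FileChannel.map, which is itself a ByteBuffer.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical class with a simplified accessor shape; it does NOT implement QDataSet.
class LazyShortSamples {

    private final ByteBuffer buffer;

    /** Wraps the bytes from the buffer's current position onward, without copying them. */
    LazyShortSamples(ByteBuffer source, ByteOrder order) {
        // slice() shares the underlying bytes; its order must be set explicitly.
        this.buffer = source.slice().order(order);
    }

    /** Number of complete 16-bit samples available. */
    int length() {
        return buffer.capacity() / 2;
    }

    /** Decodes sample i on demand using an absolute read, so no state is changed. */
    double value(int i) {
        return buffer.getShort(i * 2);
    }

    public static void main(String[] args) {
        byte[] raw = { 0x01, 0x00, (byte) 0xFE, (byte) 0xFF };
        LazyShortSamples samples =
                new LazyShortSamples(ByteBuffer.wrap(raw), ByteOrder.LITTLE_ENDIAN);
        for (int i = 0; i < samples.length(); i++) {
            System.out.println(i + ": " + samples.value(i));
        }
    }
}
```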
