Python API documentation
SLMotion can be controlled with general purpose Python scripts, using an API that it provides. This document provides a short reference of the different functions and classes that the API exposes. All of the content described here is contained within the slmotion module in the embedded Python interpreter, which can be accessed by the usual import means.
Index
- Function and class reference
- Type conversion information
- References
Function and class reference
process(skip=0, framesMax=SIZE_MAX)
Overview
Runs the component chain on each video with the given settings. Usually, this is the last function to be called.
Arguments
- skip The number of frames to skip from the beginning of the video (i.e. the 0-based frame number of the first frame to be processed)
- framesMax The maximum number of frames to process, counted from the first non-skipped frame.
Return values
None
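The interplay of skip and framesMax can be pictured with a small stand-alone sketch. The helper frames_to_process below is hypothetical and not part of the slmotion module; it only mirrors the argument descriptions above, under the assumption that processing never runs past the end of the video.

```python
import sys


def frames_to_process(total_frames, skip=0, frames_max=sys.maxsize):
    """Return the 0-based frame numbers that a call such as
    process(skip, framesMax) would be expected to cover.

    Hypothetical helper for illustration; it does not call SLMotion.
    """
    first = min(skip, total_frames)               # first frame to process
    last = min(total_frames, first + frames_max)  # exclusive end
    return range(first, last)


print(list(frames_to_process(10, skip=3, frames_max=4)))
```

With a ten-frame video, skipping three frames and processing at most four covers frames 3 to 6.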
setAnnotationTextFormat(format)
Overview
Sets the format of the text displayed on the annotation bar below the output video. The syntax resembles that of the printf function in C, i.e. special characters are marked with % followed by a single character. The list of these variables is shown below:
- %f: current frame number
- %t: current annotation text (as given in the annotation file, see the --ann-file command line parameter)
- %l: the PicSOM label. If you do not know what PicSOM is, you may ignore this variable.
- %%: The % sign.
Arguments
- format The desired format string. The default is "Frame: %f".
Return values
None
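To make the placeholder rules concrete, the following sketch expands a format string in plain Python. The function expand_annotation_format is an illustrative stand-in, not SLMotion's own implementation; in particular, leaving unknown % sequences untouched is an assumption.

```python
def expand_annotation_format(fmt, frame, text="", label=""):
    """Expand %f, %t, %l and %% as described in the list above."""
    values = {"f": str(frame), "t": text, "l": label, "%": "%"}
    out = []
    i = 0
    while i < len(fmt):
        ch = fmt[i]
        if ch == "%" and i + 1 < len(fmt) and fmt[i + 1] in values:
            out.append(values[fmt[i + 1]])
            i += 2
        else:
            # Ordinary characters, and unknown % sequences (an assumption),
            # are copied verbatim.
            out.append(ch)
            i += 1
    return "".join(out)


print(expand_annotation_format("Frame: %f", 42))
print(expand_annotation_format("%f (%t) 100%%", 7, text="HELLO"))
```

Inside the embedded interpreter, the corresponding call would simply pass the format string, e.g. slmotion.setAnnotationTextFormat("Frame: %f").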
class Component
Overview
This class is a Python abstraction of the C++ components. It provides the information needed to instantiate the actual component objects on the C++ side.
Component.Component(componentName, options)
Overview
This is the relevant constructor for the Component class. The values passed here will be used for instantiating the actual C++ components later on.
Arguments
- componentName The short name of the corresponding C++ component.
- options A dictionary providing the relevant options (much like in the configuration files, see default.conf). Each key is assumed to be a (non-prefixed) string, corresponding to the relevant option. The values may typically be strings, integers, or floating point numbers. Please see the component reference for more accurate information.
Return values
None
setComponents(componentList)
Overview
Sets the components to be used in the given component chain.
Arguments
- componentList A list of Component objects.
Return values
None
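A component chain is typically built from (name, options) pairs and then handed to setComponents(). The sketch below shows that pattern. The component names and option keys are hypothetical placeholders (the real ones are listed in the component reference), and make_stand_in() is only a substitute that lets the example run outside the embedded interpreter.

```python
from types import SimpleNamespace


def build_chain(slm, specs):
    """Create one Component per (short name, options) pair and register them.

    `slm` is the slmotion module, or any object exposing Component and
    setComponents with the signatures documented above.
    """
    components = [slm.Component(name, options) for name, options in specs]
    slm.setComponents(components)
    return components


def make_stand_in():
    """Hypothetical substitute for the slmotion module, for illustration."""
    registered = []

    def Component(componentName, options):
        return (componentName, dict(options))

    def setComponents(componentList):
        registered[:] = componentList

    return SimpleNamespace(Component=Component,
                           setComponents=setComponents,
                           registered=registered)


# Hypothetical component names and options; not real SLMotion components.
SPECS = [
    ("ExampleDetector", {"threshold": 0.5}),
    ("ExampleTracker", {"maxpoints": 100, "mode": "fast"}),
]

fake = make_stand_in()
build_chain(fake, SPECS)
print([name for name, _ in fake.registered])
```

Inside SLMotion, the same build_chain() function would be called with the imported slmotion module as its first argument, usually followed by process().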
verbose(level=1)
Overview
Sets the verbosity level, i.e. the amount of debugging information printed to stdout.
Arguments
- level Verbosity level. If this function is never called, the default is 0. The default value for the call is 1 (corresponding to the --verbose command line option). Values higher than 1 print more information, but the extra output is unlikely to be useful to the end user.
Return values
None
setVisualisation(script)
Overview
Sets a simple script to control the visualisation facilities of the SLMotion process. The script consists of simple semicolon-separated statements. Some of the recognised statements are shown below.
Arguments
- script A script composed of semicolon-separated statements. Here are some such statements:
- cropToRect <rect> Given a rectangle, crops the image and cuts out anything outside the rectangle.
- locate <x> <y> Sets the cursor location. The cursor corresponds to the upper left corner of the virtual image where visualisation is rendered. Effectively, this statement can be used to e.g. draw several matrices side-by-side. The default cursor location is (0,0). x can be an integer or frame-width, and, likewise, y can be an integer or frame-height.
- showAsms Tries to draw the left and right hand and head fitted ASMs.
- showBlobs Draws the detected blobs.
- showBodyParts Draws the detected body part blobs.
- showFrame Draws the default RGB frame from the default track.
- showFrame depth Draws the depth information from the first depth frame. This is mainly intended for Kinect data.
- showFrame <integer> Draws the input frame from the given track. The track number is 0-based.
- showKinectSkeleton Supposing that the input is Kinect format (typically in an ONI file), and it has associated skeleton information, this will draw yellow lines about the joints, revealing the skeleton.
- showMatrix <key> Fetches the matrix by the key for the current frame and draws it. Drawing only works if the matrix type is either CV_8UC1 or CV_8UC3, or the matrix is single channel, floating point valued, and the elements are in the range [0,1]. Otherwise, the results are undefined.
- showPoints <vector> Given a vector of points, draws a small circle around each point. Please see the type section at the bottom of this document for more information about the format of these vectors.
- showRect <name> Draws a rectangle on the screen. name specifies the name of the Black Board entry containing the rectangle.
- showRects <name> Draws the rectangles in a vector of rectangles on the screen. name specifies the name of the Black Board entry containing the vector of rectangles.
- showSuviSegments Draws the face area segments as defined in [1]. Note: This option requires libflandmark [2] support to be enabled. Drawing these segments requires that the output of the FaceSuviSegmentator component is available on the black board.
Return values
None
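Because the script is a single string, it can be convenient to assemble it from a list of statements. The helper below is a hypothetical convenience, not part of the API; joining with a bare semicolon and omitting a trailing semicolon are assumptions about what the parser accepts.

```python
def build_visualisation_script(statements):
    """Join individual visualisation statements into one script string."""
    cleaned = [s.strip() for s in statements if s.strip()]
    return ";".join(cleaned)


script = build_visualisation_script([
    "showFrame",             # RGB frame at the default cursor location (0,0)
    "locate frame-width 0",  # move the cursor one frame width to the right
    "showFrame depth",       # depth frame drawn beside the RGB frame
])
print(script)
```

The resulting string would then be passed to slmotion.setVisualisation(script); here it places the depth frame to the right of the RGB frame, using the side-by-side technique mentioned for the locate statement.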
class BlackBoard
Overview
This class is a Python abstraction of the C++ BlackBoard. The class represents a kind of data storage through which data passes between components. The data is stored on two tables: a frame-specific table and a global table. The only difference is that data on the frame-specific table is indexed with a frame number along with the property key. Property keys are component-specific strings that identify different data entries. This class has no public constructors; its only instance is called "slmotion.blackboard" (note the case), so it may be regarded as a singleton on the Python side.
BlackBoard.has(propertyName, framenumber=default)
Overview
Queries the black board for the existence of entries.
Arguments
- propertyName The name of the entry (e.g. facedetection). Please see the component reference for details.
- framenumber Frame number. This is an optional argument; if it is not given, the entries are sought on the global table. However, if it is given, the frame-specific table is used instead.
Return values
A boolean value.
BlackBoard.get(propertyName, framenumber=default)
Overview
Retrieves an entry from the black board and returns it converted to a Python type. The details of the conversion depend entirely on the type in question.
Arguments
- propertyName The name of the entry (e.g. facedetection). Please see the component reference for details.
- framenumber Frame number. This is an optional argument; if it is not given, the entries are sought on the global table. However, if it is given, the frame-specific table is used instead.
Return values
An object of an appropriate type.
BlackBoard.set(object, propertyName, framenumber=default, attribs=default)
Overview
Sets the given object as the entry for the propertyName/framenumber key on the black board.
Arguments
- object Any Python object that has a supported conversion to a C++ type. Basically, this includes the entries supported by the components. Please see the component reference for details.
- propertyName The name of the entry (e.g. facedetection). Please see the component reference for details.
- framenumber Frame number. This is an optional argument; if it is not given, the entries are sought on the global table. However, if it is given, the frame-specific table is used instead. The global table is also used if the frame number is negative (i.e. in case the user might want to use non-default attributes).
- attribs Attributes that affect the storage behaviour of the object. The default value is COMPRESS_WITHOUT_REFERENCES, which applies DEFLATE compression to the data whenever it is not referenced by any pointer on the C++ side. Python code never holds such a reference, because the data is always copied when it is moved to the Python side. Compression and decompression take time but save memory (particularly when dealing with binary matrices). To disable compression, use NONE as the attribute.
Return values
None.
BlackBoard.setAs(object, typespec, propertyName, framenumber=default, attribs=default)
Overview
Sets the given object as the entry for the propertyName/framenumber key on the black board. This function behaves almost exactly like set(), except that conversion to a specific type can be requested via the typespec parameter.
Arguments
- object Any Python object that has a supported conversion to a C++ type. Basically, this includes the entries supported by the components. Please see the component reference for details.
- typespec The string name of the target type. Currently, the following types are supported: unsigned int, (unsigned) long, (unsigned) short, float, Rect, Point, Point2f, vector<Point>, vector<Point2f>, and vector<Rect>.
- propertyName The name of the entry (e.g. facedetection). Please see the component reference for details.
- framenumber Frame number. This is an optional argument; if it is not given, the entries are sought on the global table. However, if it is given, the frame-specific table is used instead. The global table is also used if the frame number is negative (i.e. in case the user might want to use non-default attributes).
- attribs Attributes that affect the storage behaviour of the object. The default value is COMPRESS_WITHOUT_REFERENCES, which applies DEFLATE compression to the data whenever it is not referenced by any pointer on the C++ side. Python code never holds such a reference, because the data is always copied when it is moved to the Python side. Compression and decompression take time but save memory (particularly when dealing with binary matrices). To disable compression, use NONE as the attribute.
Return values
None.
BlackBoard.delete(propertyName, framenumber=default)
Overview
Removes the given instance from the global board, or if framenumber is given, from the frame board.
Arguments
- propertyName The name of the entry (e.g. facedetection). Please see the component reference for details.
- framenumber Frame number. This is an optional argument; if it is not given, the entry is removed from the global table. However, if it is given, the frame-specific table is used instead. The global table is also used if the frame number is negative.
Return values
None.
BlackBoard.deleteAll(propertyName)
Overview
Removes all instances from the global board and the frame board that go by the given name.
Arguments
- propertyName The name of the entries (e.g. facedetection). Please see the component reference for details.
Return values
None.
BlackBoard.clear()
Overview
Removes everything from the black board.
Arguments
None.
Return values
None.
BlackBoard.COMPRESS_WITHOUT_REFERENCES
Overview
A constant that specifies the attribute value that makes matrices be compressed using the DEFLATE algorithm when they are not being referred to by a C++ pointer.
BlackBoard.NONE
Overview
A constant that specifies the attribute value that makes matrices never compressed.
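The two-table lookup used by has(), get(), set(), delete(), deleteAll() and clear() can be modelled in a few lines of plain Python. BlackBoardModel below is a teaching aid, not the slmotion.blackboard object: it ignores type conversion and the attribs argument, it applies the "negative frame number means global table" rule to every method (the text states it only for set(), setAs() and delete()), and its behaviour for missing entries is an assumption.

```python
class BlackBoardModel:
    """Pure-Python model of the global and frame-specific tables."""

    def __init__(self):
        self.global_table = {}  # propertyName -> value
        self.frame_table = {}   # (framenumber, propertyName) -> value

    def _slot(self, name, framenumber):
        # No frame number, or a negative one, selects the global table.
        if framenumber is None or framenumber < 0:
            return self.global_table, name
        return self.frame_table, (framenumber, name)

    def has(self, name, framenumber=None):
        table, key = self._slot(name, framenumber)
        return key in table

    def get(self, name, framenumber=None):
        table, key = self._slot(name, framenumber)
        return table[key]  # KeyError for a missing entry (an assumption)

    def set(self, obj, name, framenumber=None):
        table, key = self._slot(name, framenumber)
        table[key] = obj

    def delete(self, name, framenumber=None):
        table, key = self._slot(name, framenumber)
        table.pop(key, None)  # missing entries are ignored (an assumption)

    def deleteAll(self, name):
        self.global_table.pop(name, None)
        for key in [k for k in self.frame_table if k[1] == name]:
            del self.frame_table[key]

    def clear(self):
        self.global_table.clear()
        self.frame_table.clear()


bb = BlackBoardModel()
bb.set([10, 20, 64, 64], "facedetection", 0)  # frame-specific entry
bb.set("summary", "facedetection")            # global entry, same name
print(bb.has("facedetection", 0), bb.has("facedetection", 1))
bb.deleteAll("facedetection")
print(bb.has("facedetection"), bb.has("facedetection", 0))
```

The example shows that an entry stored for frame 0 is invisible to queries for frame 1, and that deleteAll() removes the name from both tables at once.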
class FrameSource
Overview
This class is a Python abstraction of the C++ FrameSource. The class provides access to the input fed to the components. Access is done on a per-frame basis via the bracket operator. Additionally, there may be several subtracks embedded into the one object, possibly representing depth information, another camera angle or some other type of information. This class has no public constructors, so the only public instance is simply called "slmotion.blackboard" (note the case). However, the class should not be regarded as a singleton on the Python side because additional instances may be available through the getTrack() function.
FrameSource.__getitem__(framenumber)
Overview
This is the bracket operator. It can be used to access the default track.
Arguments
- framenumber Frame number of the desired frame
Return values
A numpy.ndarray representing the desired frame.
FrameSource.__len__()
Overview
Queries the total number of frames.
Arguments
None.
Return values
The number of frames as an integer.
FrameSource.getNumberOfTracks()
Overview
Queries the number of tracks.
Arguments
None.
Return values
The number of tracks as an integer.
FrameSource.getTrack(trackNumber)
Overview
Queries a subtrack. An exception is raised if trackNumber is not within [0,getNumberOfTracks()-1].
Arguments
- trackNumber The 0-based track index.
Return values
Another FrameSource instance representing the desired subtrack.
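The following sketch walks over a frame source using only the four members documented above. FakeSource is a stand-in so that the example runs outside the embedded interpreter; it stores strings instead of numpy.ndarray frames, treats track 0 as the default track, and raises IndexError for a bad track number, all of which are assumptions.

```python
def frames_per_track(source):
    """Return the frame count of every subtrack of `source`."""
    return [len(source.getTrack(i))
            for i in range(source.getNumberOfTracks())]


def iter_frames(source, first=0, last=None):
    """Yield (framenumber, frame) pairs from the default track."""
    if last is None:
        last = len(source)
    for n in range(first, last):
        yield n, source[n]


class FakeSource:
    """Stand-in for a FrameSource; frames are plain strings here."""

    def __init__(self, tracks):
        self._tracks = tracks  # list of tracks, each a list of frames

    def __getitem__(self, framenumber):
        return self._tracks[0][framenumber]

    def __len__(self):
        return len(self._tracks[0])

    def getNumberOfTracks(self):
        return len(self._tracks)

    def getTrack(self, trackNumber):
        if not 0 <= trackNumber < len(self._tracks):
            raise IndexError(trackNumber)
        return FakeSource([self._tracks[trackNumber]])


src = FakeSource([["rgb0", "rgb1", "rgb2"], ["depth0", "depth1", "depth2"]])
print(frames_per_track(src))
print(list(iter_frames(src, first=1)))
```

The two helper functions rely only on len(), the bracket operator, getNumberOfTracks() and getTrack(), so they should work unchanged with a real FrameSource instance.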
constructFeatureVector(specification, firstFrame=0, lastFrame=SIZE_MAX)
Overview
Constructs a feature vector from entries on the black board according to a specification.
Arguments
- specification A list of dictionaries describing each feature. Features here are simple numerical values. Here is a list of keys that can be used:
- property Blackboard property name. This is a compulsory setting. The value should be a string. Special values include "framenr" and "timecode", which do not take their values from the blackboard but instead inject the current frame number and the corresponding timecode, respectively.
- interpretation Interpretation of the value on the black board. The value should be a string. The list of interpretations is as follows:
- Value Use the numerical value verbatim.
- Expand "Expands" a multi-valued entity to a vector of specific size (e.g. a point vector of 8 points would become a 16-value vector). The values are then injected and the output vector is expanded to make room for the values.
- size Size of the vector when using the "Expand" interpretation.
- name Name of the feature. This is a compulsory setting. The value should be a string.
- minimum Minimum value of the feature. This is an optional setting. If left unspecified, -DBL_MAX is used. The value should be a floating point number.
- maximum Maximum value of the feature. This is an optional setting. If left unspecified, DBL_MAX is used. The value should be a floating point number.
- type Type of the feature. This is an optional setting. If left unspecified, 'real' is used. The value should be a string. The following values are recognised:
- real A floating point number
- integer An integer number
- boolean A binary truth value
- unit Real world unit of the feature (e.g. ms, Hz, m, etc.). This is an optional setting. If left unspecified, 1 is used. The value should be a string.
- firstFrame First frame number, inclusive. Default value is 0.
- lastFrame Last frame number, inclusive. Default value is SIZE_MAX which will be interpreted as the last possible frame as indicated by the default frame source object.
Return values
None
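A specification is an ordinary list of dictionaries, so it can be written out and sanity-checked before it is passed to constructFeatureVector(). In the sketch below, check_specification is a hypothetical helper, the property names "examplespeed" and "examplepoints" are placeholders, and the rule that "Expand" requires a size key is an assumption drawn from the key descriptions above.

```python
COMPULSORY_KEYS = {"property", "name"}
OPTIONAL_KEYS = {"interpretation", "size", "minimum", "maximum", "type", "unit"}
KNOWN_TYPES = {"real", "integer", "boolean"}


def check_specification(specification):
    """Return a list of human-readable problems found in a specification."""
    problems = []
    for index, feature in enumerate(specification):
        keys = set(feature)
        for key in sorted(COMPULSORY_KEYS - keys):
            problems.append(f"feature {index}: missing compulsory key '{key}'")
        for key in sorted(keys - COMPULSORY_KEYS - OPTIONAL_KEYS):
            problems.append(f"feature {index}: unknown key '{key}'")
        if feature.get("type", "real") not in KNOWN_TYPES:
            problems.append(f"feature {index}: unrecognised type")
        if feature.get("interpretation") == "Expand" and "size" not in feature:
            problems.append(f"feature {index}: 'Expand' needs a size")
    return problems


specification = [
    # Special property: injects the current frame number.
    {"property": "framenr", "interpretation": "Value",
     "name": "frame", "type": "integer"},
    # Hypothetical scalar entry on the black board.
    {"property": "examplespeed", "interpretation": "Value",
     "name": "speed", "minimum": 0.0, "unit": "m/s"},
    # Hypothetical vector of 8 points, expanded to 16 values.
    {"property": "examplepoints", "interpretation": "Expand",
     "size": 16, "name": "points"},
]

print(check_specification(specification))
```

An empty list means that no problems were found; the specification could then be passed on as slmotion.constructFeatureVector(specification).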
outputVideo(first=default, last=default)
Overview
Performs visualisation and writes the results to the video output file (or image sequence).
Arguments
- first First frame number, inclusive. Default value is 0.
- last Last frame number, exclusive. Default value is SIZE_MAX.
Return values
None
storeFeatureData()
Overview
Writes the feature vector constructed with constructFeatureVector() to disk.
Arguments
None
Return values
None
storeElanAnnotations()
Overview
Writes the Elan annotations constructed from the feature vector according to the annotation templates to disk.
Arguments
None
Return values
None
class PythonComponentBase
Overview
This is a base class that can be used to aid the creation of new Python-based components. The class is not intended to be instantiated on its own; rather, it should work only as a base class to user-created actual component classes. For an example, please see PythonExampleComponent.py.
The class is abstract in the sense that attempting to use it as such will raise a NotImplementedError exception. The user should override the basic functions that report some information about the class (i.e. the component name and the requirements and such), and either process() or processRange(), or both. A default implementation is provided for processRange() that will call process() function for each frame in the range.
Note: The module also contains a class called PythonComponentBaseBase which is a base of this class. The user should not touch it. It is for internal use only, and is related to the C++/Python wrapping facilities. That class does not expose any members in its Python interface.
PythonComponentBase.processRange(firstFrame, lastFrame)
Overview
The default implementation calls the process() function for each frame i in firstFrame, firstFrame+1, ... , lastFrame-1, i.e. the first frame number is inclusive, the last frame number is exclusive. The frame numbers are 0-based, and, naturally, lastFrame > firstFrame ought to be true.
The function should call the callback() function from time to time to report the progress to the user interface and check if the user has requested process termination. The default implementation does so after each process() function call.
Arguments
- firstFrame Inclusive first frame to process
- lastFrame Exclusive last frame.
Return values
A boolean value that indicates if the process was completed successfully.
PythonComponentBase.process(framenumber)
Overview
The default implementation simply raises a NotImplementedError exception. The overridden version of this function should contain the logic to perform any necessary processing on a single frame as requested in the framenumber parameter.
This function is of course not necessary if single frame processing is not meaningful to the component. In that case, the user is requested to override the processRange() function.
Arguments
- framenumber The 0-based frame number of the frame that is to be processed.
Return values
None.
PythonComponentBase.callback(percentage)
Overview
Reports the percentage complete to the user interface, and returns information about whether the user wants to terminate the process prematurely. This function cannot be overridden.
Arguments
- percentage A value in the range [0,100] that will tell the user interface the percentage of the process that has been completed.
Return values
A boolean value. True will mean that everything should go on normally. False will mean that the process should be terminated as soon as possible.
PythonComponentBase.getShortDescription()
Overview
Returns a short description, typically about one sentence long, that briefly describes what the component does. This text may appear e.g. on user interfaces. The function should be overridden.
Arguments
None.
Return values
A string. The default implementation returns "A Python component".
PythonComponentBase.getLongDescription()
Overview
Returns a detailed description that describes what the component does. This text may appear e.g. on documentation. The function should be overridden.
Arguments
None.
Return values
A string. The default implementation returns "A Python configuration file help text".
PythonComponentBase.getRequirements()
Overview
Returns a set of strings that describe which properties should exist on the black board before the component is run. The user should override this function; the default implementation will only raise a NotImplementedError exception.
Arguments
None.
Return values
A set of strings.
PythonComponentBase.getProvided()
Overview
Returns a set of strings that describe which properties the component will store (add or modify) on the black board when it is run. The user should override this function; the default implementation will only raise a NotImplementedError exception.
Arguments
None.
Return values
A set of strings.
PythonComponentBase.getShortName()
Overview
Returns a short name for the component. It should not contain white space characters. This name is used when identifying the component internally, or by the user. Typically, it should correspond to the name of the class. The user should override this function; the default implementation will only raise a NotImplementedError exception.
Arguments
None.
Return values
A string.
PythonComponentBase.getComponentName()
Overview
Returns a user-readable name for the component. It may contain white space characters. This name is used e.g. when displaying information about the component in the user interface. The user should override this function; the default implementation will only raise a NotImplementedError exception.
Arguments
None.
Return values
A string.
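Putting the pieces together, a minimal user-defined component might look like the sketch below. FrameCounter is a hypothetical component that only records the frame numbers it is given. The fallback base class exists solely so that the example runs outside the embedded interpreter; it imitates the documented defaults, and the way it computes the progress percentage is an assumption. Whether the real base class can be constructed without arguments is likewise an assumption.

```python
try:
    from slmotion import PythonComponentBase  # inside the embedded interpreter
except ImportError:
    class PythonComponentBase:
        """Stand-in imitating the documented default behaviour."""

        def process(self, framenumber):
            raise NotImplementedError

        def processRange(self, firstFrame, lastFrame):
            total = lastFrame - firstFrame
            for done, i in enumerate(range(firstFrame, lastFrame), start=1):
                self.process(i)
                # Report progress; stop if termination was requested.
                if not self.callback(100.0 * done / total):
                    return False
            return True

        def callback(self, percentage):
            return True  # nobody asks for termination in the stand-in


class FrameCounter(PythonComponentBase):
    """Hypothetical component that records which frames it processed."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def process(self, framenumber):
        self.seen.append(framenumber)

    def getShortName(self):
        return "FrameCounter"

    def getComponentName(self):
        return "Frame counter"

    def getShortDescription(self):
        return "Records the numbers of the processed frames."

    def getLongDescription(self):
        return "An example component that stores nothing on the black board."

    def getRequirements(self):
        return set()  # needs nothing from the black board

    def getProvided(self):
        return set()  # stores nothing on the black board


component = FrameCounter()
ok = component.processRange(2, 5)
print(ok, component.seen)
```

Only process() is overridden here, so the frame range is handled by the default processRange(), which covers frames 2, 3 and 4 because the last frame number is exclusive.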
Type conversion information
This section explains how certain type conversions are performed when moving data between Python and C++ environments.
Matrices
Matrices are typically presented as cv::Mat objects on the C++ side. On the Python side, numpy.ndarrays are used because this is the convention adopted by the newer OpenCV Python interface. Conversions, when performed through the BlackBoard, are very straightforward: the data is copied and mapped to a similar data type. There is typically a 1:1 correspondence between matrices and their data types.
However, the dimensionalities have different interpretations. Vectors are considered 2D matrices in OpenCV, whereas numpy assumes one dimension. Therefore, matrices with unit length in one dimension are mapped to one dimension. On the other hand, typically colour images are represented as 2D matrices with three channels in OpenCV. Numpy does not know the channel notion, so these images are mapped to 3D ndarrays.
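The two dimensionality rules can be summarised as a small shape-prediction function. This is only one reading of the paragraph above, not the converter itself: ndarray_shape is a hypothetical helper, and the order in which the rules are applied (channels first, then unit-length dimensions) is an assumption.

```python
def ndarray_shape(rows, cols, channels=1):
    """Predict the ndarray shape for a 2D cv::Mat of the given size."""
    if channels > 1:
        # Multi-channel images gain a third dimension for the channels.
        return (rows, cols, channels)
    if rows == 1 or cols == 1:
        # Single-channel row or column vectors collapse to one dimension.
        return (max(rows, cols),)
    return (rows, cols)


print(ndarray_shape(480, 640, 3))  # colour image
print(ndarray_shape(1, 5))         # row vector
print(ndarray_shape(480, 640))     # single-channel image
```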
Rectangles
Rectangles are typically presented as cv::Rect objects on the C++ side. However, the cv2 interface does not seem to provide an equivalent. Therefore, cv::Rect objects are converted to one-dimensional numpy.ndarrays with four elements: top-left x coordinate, top-left y coordinate, the width, and the height, respectively. Because conversion to the other side is ambiguous, and the interpretation that a numpy.ndarray represents a matrix is preferred, rectangles can be explicitly constructed by using the setAs() function with "Rect" as the type specifier.
Point vectors
Point vectors are typically presented as std::vector<cv::Point> or std::vector<cv::Point2f> objects on the C++ side, depending on the data type. However, the cv2 interface does not seem to provide an equivalent to the Point class. Instead, points are represented as 2-element row vectors, and point vectors as 3-dimensional matrices whose first dimension corresponds to the number of points, second dimension is one, and the third dimension is 2. Therefore, point vector objects are converted to three-dimensional numpy.ndarrays. Because conversion to the other side is ambiguous, and the interpretation that a numpy.ndarray represents a matrix is preferred, point vectors can be explicitly constructed by using the setAs() function with "vector<Point>" or "vector<Point2f>" as the type specifier.
Rectangle vectors
Rectangle vectors are typically presented as std::vector<cv::Rect> objects on the C++ side. However, the cv2 interface does not seem to provide an equivalent to the Rect class, as noted above. Therefore, on the Python side, rectangle vectors are represented as lists of ndarrays. When storing these vectors on the black board, setAs with vector<Rect> as type specifier may be needed to disambiguate the list from a list of ndarrays.
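The layouts described for rectangles, point vectors and rectangle vectors can be illustrated with nested lists. The helpers below are hypothetical and use plain lists so that the sketch has no dependencies; in the embedded interpreter the values would normally be wrapped in numpy.array before being stored.

```python
def rect_to_list(x, y, width, height):
    """Rectangle layout: top-left x, top-left y, width, height."""
    return [x, y, width, height]


def points_to_nested(points):
    """Turn [(x, y), ...] into the n x 1 x 2 layout used for point vectors."""
    return [[[x, y]] for x, y in points]


def nested_shape(nested):
    """Return the shape of a regular nested list as a tuple."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0] if nested else None
    return tuple(shape)


rect = rect_to_list(10, 20, 64, 48)
points = points_to_nested([(10, 20), (30, 40), (50, 60)])
rects = [rect_to_list(0, 0, 8, 8), rect_to_list(8, 8, 16, 16)]

print(nested_shape(rect), nested_shape(points), nested_shape(rects))
```

Since a four-element array or a list of arrays could equally well be a matrix, storing such values would go through setAs() with an explicit type specifier, for example slmotion.blackboard.setAs(array, "vector<Point>", "examplepoints", 0), where "examplepoints" is a placeholder property name.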
Type conversion table
This table summarises how slmotion handles type conversions between C++ and Python environments. In all cases, data is copied.
C++ type | Python type | Notes |
(unsigned) short | int | The Python int corresponds to C long in CPython. |
(unsigned) int | int | The Python int corresponds to C long in CPython. |
(unsigned) long | int | The Python int corresponds to C long in CPython. |
float | float | The Python float internally corresponds to IEEE 754 double precision |
double | float | The Python float internally corresponds to IEEE 754 double precision |
FeatureVector | list (of lists of floats) | When converting to a Python object, the feature vector is first converted into a deque of double vectors. The deque is then converted to a list of lists of floats. Each inner list corresponds to one frame, and they all have equal dimensionality. Each element in the inner lists corresponds to one feature entry. All metadata is lost in the conversion. |
cv::Mat | numpy.ndarray | Precise conversion depends on the type of the matrix (i.e. data type, dimensionality, and number of channels) |
cv::Rect | numpy.ndarray | Explicit conversion is available via setAs(). |
cv::Point | numpy.ndarray | Explicit conversion is available via setAs(). |
cv::Point2f | numpy.ndarray | Explicit conversion is available via setAs(). |
References
[1] Ville Viitaniemi, Matti Karppa, Jorma Laaksonen, and Tommi Jantunen. Detecting Hand-Head Occlusions in Sign Language Video. In Proceedings of the 18th Scandinavian Conference on Image Analysis (SCIA 2013), Espoo, Finland, June 2013. Available online at http://users.ics.aalto.fi/jmkarppa/scia2013occlusion.pdf.
[2] M. Uřičář, V. Franc, and V. Hlaváč. Detector of Facial Landmarks Learned by the Structured Output SVM. In VISAPP '12: Proceedings of the 7th International Conference on Computer Vision Theory and Applications, 2012.