TrackerMinerFS

TrackerMinerFS — Abstract base class for filesystem miners

Properties

TrackerDataProvider * data-provider Read / Write / Construct Only
gboolean initial-crawling Read / Write
gboolean mtime-checking Read / Write / Construct
guint processing-pool-ready-limit Read / Write / Construct
guint processing-pool-wait-limit Read / Write / Construct
GFile * root Read / Write / Construct Only
gdouble throttle Read / Write

Signals

void finished Run Last
void finished-root Run Last
gboolean ignore-next-update-file Run Last
gboolean process-file Run Last
gboolean process-file-attributes Run Last
gboolean remove-file Run Last
gboolean writeback-file Run Last

Object Hierarchy

    GObject
    ╰── TrackerMiner
        ╰── TrackerMinerFS

Implemented Interfaces

TrackerMinerFS implements GInitable.

Includes

#include <libtracker-miner/tracker-miner.h>

Description

TrackerMinerFS is an abstract base class for miners that collect data from a filesystem, where parent/child relationships must be inserted into the database in the correct order and pending work is managed through queues.

All filesystem crawling and monitoring is abstracted away, leaving implementations to decide which directories and files to process and to perform the actual data extraction.

Example creating a TrackerMinerFS with our own file system root and data provider.

First create our class and base it on TrackerMinerFS:

G_DEFINE_TYPE_WITH_CODE (MyMinerFiles, my_miner_files, TRACKER_TYPE_MINER_FS,
                         G_IMPLEMENT_INTERFACE (G_TYPE_INITABLE,
                                                my_miner_files_initable_iface_init))

Later, in our class creation function, we supply the arguments we want. In this case, 'root' is a GFile pointing to a root URI location (for example 'file:///') and 'data_provider' is a TrackerDataProvider used to enumerate 'root' and return the children it finds. If 'data_provider' is NULL (the default), a TrackerFileDataProvider is created automatically.

// Note that only 'name' is mandatory
miner = g_initable_new (MY_TYPE_MINER_FILES,
                        NULL,
                        error,
                        "name", "MyMinerFiles",
                        "root", root,
                        "data-provider", data_provider,
                        "processing-pool-wait-limit", 10,
                        "processing-pool-ready-limit", 100,
                        NULL);

Functions

tracker_miner_fs_error_quark ()

GQuark
tracker_miner_fs_error_quark (void);

Gives the caller the GQuark used to identify TrackerMinerFS errors in GError structures. The GQuark is used as the domain for the error.

Returns

the GQuark used for the domain of a GError.

Since: 1.2


tracker_miner_fs_get_indexing_tree ()

TrackerIndexingTree *
tracker_miner_fs_get_indexing_tree (TrackerMinerFS *fs);

Returns the TrackerIndexingTree which determines what files/directories are indexed by fs.

Parameters

fs

a TrackerMinerFS

 

Returns

The TrackerIndexingTree holding the indexing configuration.

[transfer none]


tracker_miner_fs_get_data_provider ()

TrackerDataProvider *
tracker_miner_fs_get_data_provider (TrackerMinerFS *fs);

Returns the TrackerDataProvider implementation, which is being used to supply GFile and GFileInfo content to Tracker.

Parameters

fs

a TrackerMinerFS

 

Returns

The TrackerDataProvider supplying content.

[transfer none]

Since: 1.2


tracker_miner_fs_get_throttle ()

gdouble
tracker_miner_fs_get_throttle (TrackerMinerFS *fs);

Gets the current throttle value, see tracker_miner_fs_set_throttle() for more details.

Parameters

fs

a TrackerMinerFS

 

Returns

a double representing a value between 0.0 and 1.0.

Since: 0.8


tracker_miner_fs_get_mtime_checking ()

gboolean
tracker_miner_fs_get_mtime_checking (TrackerMinerFS *fs);

Returns a boolean indicating whether file modification time checks are performed when processing content. This may be set to FALSE when working predominantly with cloud data, where these checks cannot be performed. By default, and for local file systems, this is enabled.

Parameters

fs

a TrackerMinerFS

 

Returns

TRUE if mtime checks for directories against the database are done when fs crawls the file system, otherwise FALSE.

Since: 0.10


tracker_miner_fs_get_initial_crawling ()

gboolean
tracker_miner_fs_get_initial_crawling (TrackerMinerFS *fs);

Returns a boolean indicating whether the indexing tree is crawled on start up. This may be set to FALSE when working predominantly with cloud data, where these checks cannot be performed. By default, and for local file systems, this is enabled.

Parameters

fs

a TrackerMinerFS

 

Returns

TRUE if a file system structure is crawled for new updates on start up, otherwise FALSE.

Since: 0.10


tracker_miner_fs_set_throttle ()

void
tracker_miner_fs_set_throttle (TrackerMinerFS *fs,
                               gdouble throttle);

Tells the filesystem miner to throttle its operations. A value of 0.0 means no throttling at all, so the miner performs operations at full speed; 1.0 is the slowest value. With a value of 1.0, fs typically waits one full second before handling the next batch of queued items.
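As a rough illustration of the scale described above, the following sketch maps a throttle value to a pause between batches. The helper throttle_to_delay_ms() is hypothetical and not part of libtracker-miner. The linear mapping is an assumption: the text only states that 0.0 means full speed and that 1.0 typically corresponds to a one-second wait.

```c
/* Hypothetical helper, not part of libtracker-miner.
 * Assumes a linear scale between the two documented end points:
 * 0.0 -> no pause, 1.0 -> 1000 ms pause between batches. */
unsigned int
throttle_to_delay_ms (double throttle)
{
	/* Treat negative values (and NaN) as "no throttling". */
	if (!(throttle > 0.0))
		return 0;

	/* Clamp to the documented upper bound of 1.0. */
	if (throttle > 1.0)
		throttle = 1.0;

	/* Round to the nearest millisecond. */
	return (unsigned int) (throttle * 1000.0 + 0.5);
}
```

Under these assumptions, a throttle of 0.5 would give a pause of about half a second. The real miner may scale its delays differently.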

Parameters

fs

a TrackerMinerFS

 

throttle

a double between 0.0 and 1.0

 

Since: 0.8


tracker_miner_fs_set_mtime_checking ()

void
tracker_miner_fs_set_mtime_checking (TrackerMinerFS *fs,
                                     gboolean mtime_checking);

Tells the miner-fs whether directory mtime checks should be performed against the database during the crawling phase, to make sure the database holds the most up-to-date version of the file being checked. Setting this to FALSE can dramatically speed up the start-up crawling of fs.

The downside is that using this consistently means that some files in the database may be out of date with respect to the files on disk.

The main purpose of this function is for systems where a fs is running the entire time and where it is very unlikely that a file could be changed outside between startup and shutdown of the process using this API.

The default if not set directly is that mtime_checking is TRUE.
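The trade-off can be pictured with a small decision function. This is a simplified model, not the miner's actual implementation. The name directory_needs_recheck() and its parameters are hypothetical, and it assumes that a directory skipped by the check is simply treated as up to date. The forced flag models a per-directory override such as the one requested through tracker_miner_fs_force_mtime_checking().

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical model, not part of libtracker-miner.
 * Decides whether a directory should be re-inspected during crawling. */
bool
directory_needs_recheck (bool   mtime_checking,
                         bool   forced,
                         time_t disk_mtime,
                         time_t stored_mtime)
{
	/* With checks disabled and no per-directory override, the stored
	 * data is assumed to be current, which is where stale entries
	 * can come from. */
	if (!mtime_checking && !forced)
		return false;

	/* Otherwise any difference between disk and database triggers
	 * a recheck. */
	return disk_mtime != stored_mtime;
}
```

In this model, a directory modified while the miner was not running is only noticed when mtime_checking is TRUE or the check is forced for that directory.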

Parameters

fs

a TrackerMinerFS

 

mtime_checking

a gboolean

 

Since: 0.10


tracker_miner_fs_set_initial_crawling ()

void
tracker_miner_fs_set_initial_crawling (TrackerMinerFS *fs,
                                       gboolean do_initial_crawling);

Tells fs whether the TrackerIndexingTree should be crawled initially. Crawling is required to set up file system monitors that use technologies such as inotify.

Setting this to FALSE can dramatically speed up the start-up of fs.

The downside is that using this consistently means that some files in the database may be out of date with respect to the files on disk.

The main purpose of this function is for systems where a fs is running the entire time and where it is very unlikely that a file could be changed outside between startup and shutdown of the process using this API.

The default if not set directly is that do_initial_crawling is TRUE.

Parameters

fs

a TrackerMinerFS

 

do_initial_crawling

a gboolean

 

Since: 0.10


tracker_miner_fs_add_directory_without_parent ()

void
tracker_miner_fs_add_directory_without_parent
                               (TrackerMinerFS *fs,
                                GFile *file);

tracker_miner_fs_add_directory_without_parent is deprecated and should not be used in newly-written code.

Tells the miner-fs that the given GFile corresponds to a directory which was created in the store without a specific parent object. In this case, when regenerating internal caches, an extra query will be done so that these elements are taken into account.

Parameters

fs

a TrackerMinerFS

 

file

a GFile

 

Since: 0.10


tracker_miner_fs_directory_add ()

void
tracker_miner_fs_directory_add (TrackerMinerFS *fs,
                                GFile *file,
                                gboolean recurse);

Tells the filesystem miner to inspect a directory.

Parameters

fs

a TrackerMinerFS

 

file

GFile for the directory to inspect

 

recurse

whether the directory should be inspected recursively

 

Since: 0.8


tracker_miner_fs_directory_remove ()

gboolean
tracker_miner_fs_directory_remove (TrackerMinerFS *fs,
                                   GFile *file);

Removes a directory from being inspected by fs. Note that only directory watches are removed.

Parameters

fs

a TrackerMinerFS

 

file

GFile for the directory to be removed

 

Returns

TRUE if the directory was successfully removed.

Since: 0.8


tracker_miner_fs_directory_remove_full ()

gboolean
tracker_miner_fs_directory_remove_full
                               (TrackerMinerFS *fs,
                                GFile *file);

Removes a directory from being inspected by fs, and removes all associated metadata of the directory (and its contents) from the store.

Parameters

fs

a TrackerMinerFS

 

file

GFile for the directory to be removed

 

Returns

TRUE if the directory was successfully removed.

Since: 0.10


tracker_miner_fs_force_mtime_checking ()

void
tracker_miner_fs_force_mtime_checking (TrackerMinerFS *fs,
                                       GFile *directory);

Tells fs to force mtime checking (regardless of the global mtime check configuration) on the given directory.

Parameters

fs

a TrackerMinerFS

 

directory

a GFile representing the directory

 

Since: 0.12


tracker_miner_fs_check_file ()

void
tracker_miner_fs_check_file (TrackerMinerFS *fs,
                             GFile *file,
                             gboolean check_parents);

Tells the filesystem miner to check and index a file. The file must be part of the usual crawling directories of TrackerMinerFS; see tracker_miner_fs_directory_add().

Parameters

fs

a TrackerMinerFS

 

file

GFile for the file to check

 

check_parents

whether to check parents and eligibility or not

 

Since: 0.10


tracker_miner_fs_check_file_with_priority ()

void
tracker_miner_fs_check_file_with_priority
                               (TrackerMinerFS *fs,
                                GFile *file,
                                gint priority,
                                gboolean check_parents);

Tells the filesystem miner to check and index a file at a given priority. The file must be part of the usual crawling directories of TrackerMinerFS; see tracker_miner_fs_directory_add().

Parameters

fs

a TrackerMinerFS

 

file

GFile for the file to check

 

priority

the priority of the check task

 

check_parents

whether to check parents and eligibility or not

 

Since: 0.10


tracker_miner_fs_check_directory ()

void
tracker_miner_fs_check_directory (TrackerMinerFS *fs,
                                  GFile *file,
                                  gboolean check_parents);

Tells the filesystem miner to check and index a directory. The directory must be part of the usual crawling directories of TrackerMinerFS; see tracker_miner_fs_directory_add().

Parameters

fs

a TrackerMinerFS

 

file

GFile for the directory to check

 

check_parents

whether to check parents and eligibility or not

 

Since: 0.10


tracker_miner_fs_check_directory_with_priority ()

void
tracker_miner_fs_check_directory_with_priority
                               (TrackerMinerFS *fs,
                                GFile *file,
                                gint priority,
                                gboolean check_parents);

Tells the filesystem miner to check and index a directory at a given priority. The directory must be part of the usual crawling directories of TrackerMinerFS; see tracker_miner_fs_directory_add().

Parameters

fs

a TrackerMinerFS

 

file

GFile for the directory to check

 

priority

the priority of the check task

 

check_parents

whether to check parents and eligibility or not

 

Since: 0.10


tracker_miner_fs_force_recheck ()

void
tracker_miner_fs_force_recheck (TrackerMinerFS *fs);

tracker_miner_fs_writeback_file ()

void
tracker_miner_fs_writeback_file (TrackerMinerFS *fs,
                                 GFile *file,
                                 GStrv rdf_types,
                                 GPtrArray *results);

Tells the filesystem miner to writeback a file.

Parameters

fs

a TrackerMinerFS

 

file

GFile for the file to check

 

rdf_types

A GStrv with rdf types

 

results

An array of results from the preparation query.

[element-type GStrv]

Since: 0.10.20


tracker_miner_fs_writeback_notify ()

void
tracker_miner_fs_writeback_notify (TrackerMinerFS *fs,
                                   GFile *file,
                                   const GError *error);

Notifies fs that all writing back on file has finished. If any error happened during file data processing, it should be passed in error; otherwise that parameter should be NULL to reflect success.

Parameters

fs

a TrackerMinerFS

 

file

a GFile

 

error

a GError with the error that happened during processing, or NULL.

 

Since: 0.10.20


tracker_miner_fs_file_notify ()

void
tracker_miner_fs_file_notify (TrackerMinerFS *fs,
                              GFile *file,
                              const GError *error);

Notifies fs that all processing on file has finished. If any error happened during file data processing, it should be passed in error; otherwise that parameter should be NULL to reflect success.

Parameters

fs

a TrackerMinerFS

 

file

a GFile

 

error

a GError with the error that happened during processing, or NULL.

 

Since: 0.8


tracker_miner_fs_get_urn ()

const gchar *
tracker_miner_fs_get_urn (TrackerMinerFS *fs,
                          GFile *file);

If the item exists in the store, this function retrieves the URN for a GFile being currently processed.

If file is not currently being processed by fs, or doesn't exist in the store yet, NULL will be returned.

Parameters

fs

a TrackerMinerFS

 

file

a GFile obtained in “process-file”

 

Returns

The URN containing the data associated to file, or NULL.

[transfer none][nullable]

Since: 0.8


tracker_miner_fs_get_parent_urn ()

const gchar *
tracker_miner_fs_get_parent_urn (TrackerMinerFS *fs,
                                 GFile *file);

If file is currently being processed by fs, this function will return the parent folder URN, if any. This function is useful to set the nie:belongsToContainer relationship. The processing order of TrackerMinerFS guarantees that a folder has already been fully processed for indexing before any of its children are processed, so this function should usually return non-NULL.

Parameters

fs

a TrackerMinerFS

 

file

a GFile obtained in “process-file”

 

Returns

The parent folder URN, or NULL.

[transfer none][nullable]

Since: 0.8


tracker_miner_fs_query_urn ()

gchar *
tracker_miner_fs_query_urn (TrackerMinerFS *fs,
                            GFile *file);

If the item exists in the store, this function retrieves the URN of the given GFile.

If file doesn't exist in the store yet, NULL will be returned.

Parameters

fs

a TrackerMinerFS

 

file

a GFile

 

Returns

A newly allocated string with the URN containing the data associated to file, or NULL.

[transfer full]

Since: 0.10


tracker_miner_fs_has_items_to_process ()

gboolean
tracker_miner_fs_has_items_to_process (TrackerMinerFS *fs);

fs keeps several priority queues for the content it is processing. This function returns TRUE if any of those queues contains at least one item. This includes items deleted, created, updated, moved or being written back.
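The check described above amounts to asking whether any queue is non-empty. The following sketch models that with a hypothetical QueueLengths structure. The structure, its field names and queues_have_items() are illustrative assumptions and do not reflect the miner's internal data layout.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of queue lengths, one field per kind of item
 * named in the description. Not part of libtracker-miner. */
typedef struct {
	size_t deleted;
	size_t created;
	size_t updated;
	size_t moved;
	size_t writeback;
} QueueLengths;

/* Returns true as soon as any queue holds at least one item. */
bool
queues_have_items (const QueueLengths *q)
{
	return q->deleted > 0 ||
	       q->created > 0 ||
	       q->updated > 0 ||
	       q->moved > 0 ||
	       q->writeback > 0;
}
```

A caller that wants to know when the miner is idle would typically poll the real function, or wait for the “finished” signal, rather than inspect queues directly.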

Parameters

fs

a TrackerMinerFS

 

Returns

TRUE if there are items to process in the internal queues, otherwise FALSE.

Since: 0.10

Types and Values

struct TrackerMinerFS

struct TrackerMinerFS;

Abstract miner implementation to get data from the filesystem.


TrackerMinerFSClass

typedef struct {
	TrackerMinerClass parent;

	gboolean (* process_file)             (TrackerMinerFS       *fs,
	                                       GFile                *file,
	                                       TrackerSparqlBuilder *builder,
	                                       GCancellable         *cancellable);
	gboolean (* ignore_next_update_file)  (TrackerMinerFS       *fs,
	                                       GFile                *file,
	                                       TrackerSparqlBuilder *builder,
	                                       GCancellable         *cancellable);
	void     (* finished)                 (TrackerMinerFS       *fs,
	                                       gdouble               elapsed,
	                                       gint                  directories_found,
	                                       gint                  directories_ignored,
	                                       gint                  files_found,
	                                       gint                  files_ignored);
	gboolean (* process_file_attributes)  (TrackerMinerFS       *fs,
	                                       GFile                *file,
	                                       TrackerSparqlBuilder *builder,
	                                       GCancellable         *cancellable);
	gboolean (* writeback_file)           (TrackerMinerFS       *fs,
	                                       GFile                *file,
	                                       GStrv                 rdf_types,
	                                       GPtrArray            *results);
	void     (* finished_root)            (TrackerMinerFS       *fs,
	                                       GFile                *root,
	                                       gint                  directories_found,
	                                       gint                  directories_ignored,
	                                       gint                  files_found,
	                                       gint                  files_ignored);
	gboolean (* remove_file)              (TrackerMinerFS       *fs,
	                                       GFile                *file,
	                                       gboolean              children_only,
	                                       TrackerSparqlBuilder *builder);

	/* <Private> */
	gpointer padding[8];
} TrackerMinerFSClass;

Prototype for the abstract class. process_file must be implemented in the deriving class in order to actually extract data.

Members

TrackerMinerClass parent;

parent object class

 

process_file ()

Called when the metadata associated to a file is requested.

 

ignore_next_update_file ()

Called after a writeback event happens on a file (deprecated since 0.12).

 

finished ()

Called when all processing has been performed.

 

process_file_attributes ()

Called when the metadata associated with a file's attributes changes, for example, the mtime.

 

writeback_file ()

Called when a file must be written back.

 

finished_root ()

Called when all resources on a particular root URI have been processed.

 

remove_file ()

Called when a file needs to be removed; see the “remove-file” signal.

gpointer padding[8];

Reserved for future API improvements.

 

enum TrackerMinerFSError

Possible errors returned when creating new objects based on the TrackerMinerFS type and when using other APIs available with this class.

Members

TRACKER_MINER_FS_ERROR_INIT

There was an error during initialization of the object. The specific details are in the message.

 

Since: 1.2

Property Details

The “data-provider” property

  “data-provider”            TrackerDataProvider *

Data provider used to populate data, similar in role to a GFileEnumerator.

Flags: Read / Write / Construct Only


The “initial-crawling” property

  “initial-crawling”         gboolean

Whether to perform initial crawling or not.

Flags: Read / Write

Default value: TRUE


The “mtime-checking” property

  “mtime-checking”           gboolean

Whether to perform mtime checks during initial crawling or not.

Flags: Read / Write / Construct

Default value: TRUE


The “processing-pool-ready-limit” property

  “processing-pool-ready-limit” guint

Maximum number of SPARQL updates that can be merged in a single connection to the store.

Flags: Read / Write / Construct

Allowed values: >= 1

Default value: 1


The “processing-pool-wait-limit” property

  “processing-pool-wait-limit” guint

Maximum number of files that can be concurrently processed by the upper layer.

Flags: Read / Write / Construct

Allowed values: >= 1

Default value: 1


The “root” property

  “root”                     GFile *

Top level URI for our indexing tree and file notify classes.

Flags: Read / Write / Construct Only


The “throttle” property

  “throttle”                 gdouble

Modifier for the indexing speed, 0 is max speed.

Flags: Read / Write

Allowed values: [0,1]

Default value: 0

Signal Details

The “finished” signal

void
user_function (TrackerMinerFS *miner_fs,
               gdouble         elapsed,
               guint           directories_found,
               guint           directories_ignored,
               guint           files_found,
               guint           files_ignored,
               gpointer        user_data)

The ::finished signal is emitted when miner_fs has finished all pending processing.

Parameters

miner_fs

the TrackerMinerFS

 

elapsed

elapsed time since mining was started

 

directories_found

number of directories found

 

directories_ignored

number of ignored directories

 

files_found

number of files found

 

files_ignored

number of ignored files

 

user_data

user data set when the signal handler was connected.

 

Flags: Run Last

Since: 0.8


The “finished-root” signal

void
user_function (TrackerMinerFS *miner_fs,
               GFile          *file,
               gpointer        user_data)

The ::finished-root signal is emitted when miner_fs has finished finding all resources that need to be indexed under the root location file. At this point, many are likely still in the queue to be added to the database, but the signal gives some indication that a location has been processed.

Parameters

miner_fs

the TrackerMinerFS

 

file

a GFile

 

user_data

user data set when the signal handler was connected.

 

Flags: Run Last

Since: 1.2


The “ignore-next-update-file” signal

gboolean
user_function (TrackerMinerFS       *miner_fs,
               GFile                *file,
               TrackerSparqlBuilder *builder,
               GCancellable         *cancellable,
               gpointer              user_data)

The ::ignore-next-update-file signal is emitted whenever a file should be marked to be ignored on its next update, and its metadata prepared for that.

builder is the TrackerSparqlBuilder where all sparql updates to be performed for file will be appended.

TrackerMinerFS::ignore-next-update-file has been deprecated since version 0.12 and should not be used in newly-written code.

Parameters

miner_fs

the TrackerMinerFS

 

file

a GFile

 

builder

a TrackerSparqlBuilder

 

cancellable

a GCancellable

 

user_data

user data set when the signal handler was connected.

 

Returns

TRUE on success, FALSE on failure

Flags: Run Last

Since: 0.8


The “process-file” signal

gboolean
user_function (TrackerMinerFS       *miner_fs,
               GFile                *file,
               TrackerSparqlBuilder *builder,
               GCancellable         *cancellable,
               gpointer              user_data)

The ::process-file signal is emitted whenever a file should be processed and its metadata extracted.

builder is the TrackerSparqlBuilder where all sparql updates to be performed for file will be appended.

This signal allows both synchronous and asynchronous extraction; in the synchronous case, cancellable can be safely ignored. In either case, on successful metadata extraction, implementations must call tracker_miner_fs_file_notify() to indicate that processing has finished on file, so the miner can execute the SPARQL updates and continue processing other files.

Parameters

miner_fs

the TrackerMinerFS

 

file

a GFile

 

builder

a TrackerSparqlBuilder

 

cancellable

a GCancellable

 

user_data

user data set when the signal handler was connected.

 

Returns

TRUE if the file is accepted for processing, FALSE if the file should be ignored.

Flags: Run Last

Since: 0.8


The “process-file-attributes” signal

gboolean
user_function (TrackerMinerFS       *miner_fs,
               GFile                *file,
               TrackerSparqlBuilder *builder,
               GCancellable         *cancellable,
               gpointer              user_data)

The ::process-file-attributes signal is emitted whenever a file should be processed, but only the attribute-related metadata extracted.

builder is the TrackerSparqlBuilder where all sparql updates to be performed for file will be appended. For the properties being updated, the DELETE statements should be included as well.

This signal allows both synchronous and asynchronous extraction; in the synchronous case, cancellable can be safely ignored. In either case, on successful metadata extraction, implementations must call tracker_miner_fs_file_notify() to indicate that processing has finished on file, so the miner can execute the SPARQL updates and continue processing other files.

Parameters

miner_fs

the TrackerMinerFS

 

file

a GFile

 

builder

a TrackerSparqlBuilder

 

cancellable

a GCancellable

 

user_data

user data set when the signal handler was connected.

 

Returns

TRUE if the file is accepted for processing, FALSE if the file should be ignored.

Flags: Run Last

Since: 0.10


The “remove-file” signal

gboolean
user_function (TrackerMinerFS       *miner_fs,
               GFile                *file,
               gboolean              children_only,
               TrackerSparqlBuilder *builder,
               gpointer              user_data)

The ::remove-file signal will be emitted on files that need removal according to the miner configuration (either the files themselves are deleted, or the directory/contents no longer need inspection according to miner configuration and their location).

This operation is always assumed to be recursive. The children_only argument will be TRUE if for any reason the topmost directory needs to stay (e.g. moved from a recursively indexed directory tree to a non-recursively indexed location).

The builder argument can be used to provide additional SPARQL deletes and updates necessary around the deletion of those items. If the return value of this signal is TRUE, builder is expected to contain all relevant deletes for this operation.

If the return value of this signal is FALSE, the miner will apply its default behavior, which is deleting all triples that correspond to the affected URIs.
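To make the recursive and children_only semantics concrete, the sketch below decides whether a given URI falls within the scope of a removal. The helper removal_affects_uri() is hypothetical and not part of libtracker-miner. It assumes URIs can be compared as plain strings with '/' as the path separator, which ignores escaping and normalization.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper, not part of libtracker-miner.
 * Returns true if uri would be covered by a removal of dir_uri. */
bool
removal_affects_uri (const char *dir_uri,
                     const char *uri,
                     bool        children_only)
{
	size_t len = strlen (dir_uri);

	/* uri must start with dir_uri to be the directory or a descendant. */
	if (len == 0 || strncmp (uri, dir_uri, len) != 0)
		return false;

	/* Exact match: the topmost directory itself stays when only
	 * its children are to be removed. */
	if (uri[len] == '\0')
		return !children_only;

	/* Descendant only if the match ends on a path boundary, so that
	 * "file:///ab" is not treated as a child of "file:///a". */
	return dir_uri[len - 1] == '/' || uri[len] == '/';
}
```

A handler that returns TRUE would need to emit deletes covering every URI for which such a test holds; returning FALSE leaves that work to the miner's default behavior.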

Parameters

miner_fs

the TrackerMinerFS

 

file

a GFile

 

children_only

TRUE if only the children of file are to be deleted

 

builder

a TrackerSparqlBuilder

 

user_data

user data set when the signal handler was connected.

 

Returns

TRUE if builder contains all the necessary operations to delete the affected resources, FALSE to let the miner implicitly handle the deletion.

Flags: Run Last

Since: 1.8


The “writeback-file” signal

gboolean
user_function (TrackerMinerFS *miner_fs,
               GFile          *file,
               GStrv           rdf_types,
               GPtrArray      *results,
               GCancellable   *cancellable,
               gpointer        user_data)

The ::writeback-file signal is emitted whenever a file must be written back.

Parameters

miner_fs

the TrackerMinerFS

 

file

a GFile

 

rdf_types

the set of RDF types

 

results

a set of results prepared by the preparation query.

[element-type GStrv]

cancellable

a GCancellable

 

user_data

user data set when the signal handler was connected.

 

Returns

TRUE on success, FALSE otherwise

Flags: Run Last

Since: 0.10.20