
6.1. Database Server Administration

This section explains the use of the server executables, configuration files and general administration procedures. These include backup, server tuning and profiling, installing extension packages etc. See the Access Interfaces chapter for a discussion of the command line SQL interface.

6.1.1. Database

See details for Installation Requirements, Operational Requirements, Operating System Support and Limits.

6.1.1.1. Server Instance Creation

Multiple Virtuoso server instances may coexist on a single machine. Each database must be assigned a unique TCP port number. On Windows platforms (except 95, 98, and ME), Virtuoso can be configured to run multiple services. For further details, see Creating and Deleting Virtuoso Services in the Installation chapter.

To run Virtuoso on a machine, only the Virtuoso server executable and a valid virtuoso.ini file are needed. An empty database file will be created automatically at first startup. None of the other files in the installation are needed for basic operation. Client interfaces may additionally need ODBC drivers or the like to be set up in system configuration files or the Windows registry but the server itself does not depend on these.
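For illustration, a second instance on the same machine needs only its own working directory and a virtuoso.ini whose listener ports do not clash with those of the first instance; the port values below are arbitrary examples:

; second instance's virtuoso.ini
[Parameters]
ServerPort = 2111

[HTTPServer]
ServerPort = 8891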


6.1.1.2. Installing Application Packages

Virtuoso comes with optionally installable SQL application packages (VAD packages) for web-based administration, on-line documentation, programming tutorials and a BPEL processor. The installer typically asks whether to install these into the demo database or the default empty database. The location of the package files varies with the OS, the form of the installer, and whether the installation is commercial or open source; to locate them, look for *.vad. The packages are typically made in two variants, one keeping the installed items in the DAV repository and the other in the file system. They are functionally equivalent in most cases, except for some tutorials that will only work in the file system. Keeping installed resources in DAV has advantages when moving the database or backing up, since the installed items do not have to be treated separately. To install these on a database, log in as dba using the isql utility and issue the SQL command:

# at the OS command line prompt, assuming the dba password is dba and the server is at the default port 1111:

$ isql 1111 dba dba 

--  At the SQL prompt:
SQL> vad_install ('conductor_dav.vad', 0);
-- The path is relative to the server's working directory. 

        

Since the packages are read from, and sometimes their extracted contents written to, the file system, the server must have access to the directories in question. Check OS permissions and set the DirsAllowed parameter in the ini file to allow the needed access.
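For example, assuming the VAD files reside in a directory named ../vad relative to the server's working directory (an illustrative path, not a fixed default), the ini file could allow access as follows:

[Parameters]
...
DirsAllowed = ., ../vad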

After the packages are installed, they can be used by pointing a web browser at their start pages. The start pages are listed on the root page of the demo database and in the Getting Started section of the OpenLink Virtuoso web site. The start page for the Conductor web admin interface is /conductor, the documentation is at /doc/html, the tutorials at /tutorial, and the BPEL admin and demos at /BPELGUI. All of these are directly under the default Virtuoso web listener; the port is shown in the message log and read from the HTTPServer section of the virtuoso.ini file.


6.1.1.3. Server Licensing

The Virtuoso server requires a valid license file before it will successfully start. The license file is always called virtuoso.lic and must reside in the working directory of the database instance - the directory where the <database-name>.db file for the instance resides.

Evaluation versions of Virtuoso come without a license file; the registration procedure takes your email address, to which a license file for your evaluation will be sent.


6.1.1.4. Server Logging - Detecting Errors

Virtuoso provides extensive log information for resolving problems. Should any problems arise, the Virtuoso logs should always be consulted.

The Virtuoso server by default will send log information to two places, the appropriate system log for the operating system and the virtuoso.log file. On Unix based operating systems, including Linux, this information will appear in the system log files. On Windows the Virtuoso log information will appear in the Application Event Log.

Virtuoso logs information in the system logs before the virtuoso.log file is opened. This ensures that errors which cannot be written to the virtuoso.log file are still recorded, such as when the virtuoso.ini file cannot be located during Virtuoso startup.

The system log feature can be disabled using the "Syslog" parameter in the [Database] section. This is described in more detail in the following section.


6.1.1.5. Configuring Server Startup Files

6.1.1.5.1. Virtuoso Configuration File

The virtuoso.ini file is read from the directory that is current at server startup. This file contains an entry for every user-settable option in the server. It is divided into the sections listed below.

Descriptions of each section and its parameters follow.

[Database]

[<TempDatabase_Name>]

This section name must match the TempStorage parameter in the Database section of the Virtuoso INI file to be of any use. Otherwise this section will be ignored.
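A minimal sketch of how the names tie together; the section name TempDatabase and the file names here are illustrative assumptions, not defaults:

[Database]
...
TempStorage = TempDatabase

[TempDatabase]
DatabaseFile    = virtuoso-temp.db
TransactionFile = virtuoso-temp.trx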


[Parameters]

Note: The default for startup behavior is to always read full extents and the default for the normal behavior is to trigger preread on the third read inside one second.


[HTTPServer]

Settings in this section control the web server component of the Virtuoso Server.


[URIQA]

The URIQA section sets parameters of both the URIQA HTTP extension and the URIQA web service. The URIQA Semantic Web Enabler section contains a detailed description of this functionality, including more details about the URIQA configuration parameters. This section should stay commented out as long as URIQA is not in use.


[SPARQL]

The SPARQL section sets parameters and limits for the SPARQL query protocol web service. This section should stay commented out as long as SPARQL is not in use.

The RDF Data Access and Data Management section contains a detailed description of this functionality.

The SPARQL INI settings can be retrieved as RDF via the http://cname/sparql?ini service.


[I18N]

[Replication]

The replication section sets the transactional replication parameters for the server.


[Mono]

[Client]

[AutoRepair]

[VDB]

[Ucms]

[Zero Config]
See Also:

The Zero Config section.


[Plugins]
See Also:

VSEI Plugins


[Striping]

This section is only effective when Striping = 1 is set in the [Database] section. When striping is enabled the Virtuoso database spans multiple segments where each segment can have multiple stripes.

Striping can be used to distribute the database across multiple locations of a file system for performance. Segmentation can be used for expansion or dealing with file size limitations. To allow for database growth when a file system is near capacity or near the OS file size limitation, new segments can be added on the same or other file systems.

Striping only has a potential performance benefit when striped across multiple devices. Striping on the same file system is needless and unlikely to alter performance; however, multiple segments do provide convenience and flexibility as described above. Striping across partitions on the same device is likely to reduce performance by causing unnecessarily high seek times on the physical disk.

Database segments are pre-allocated as defined. This can reduce the potential for file fragmentation which could also provide some performance benefit.

Virtuoso striping alone does not allow for any fault tolerance. This is best handled at the I/O layer or by the operating system. File system RAID with fault-tolerant striping defined should be used to host the Virtuoso files if striping based protection is desired.

The segments are numbered; their segment <number> must be specified in order, starting with Segment1.

The <size> is the total size of the segment, which will be divided equally across all stripes comprising the segment. It can be specified in gigabytes (g), megabytes (m), kilobytes (k) or in database blocks (b), the default.

Note:

The segment size must be a multiple of the database page size which is currently 8k. Also, the segment size must be divisible by the number of stripe files contained in the segment.

Segments can be added to a database, however once defined they should never be altered. Databases that were created without striping cannot automatically be restarted with striping. You can convert a non-striping database to striping by dumping the contents of the database to a transaction file and replaying it with striping enabled. An on-line backup made with backup_online () can be restored on a database with a different striping configuration as long as the total number of pages is no less than the number of pages of the backed up database.

Striping can be useful for the temporary objects database if large hash join temporary spaces or the like are expected. This is enabled by the Striping setting in the temp database section of the ini file. The stripes are declared in the TempStriping section.
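A sketch of such a declaration, following the same Segment<number> format as the [Striping] section (the stripe file names are illustrative):

[TempStriping]
Segment1 = 100M, temp-seg1-1.db, temp-seg1-2.db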

See Also:

Rebuilding A Database in the Backup section.



6.1.1.5.2. Sample Configuration File ("virtuoso.ini")

Following is the text of the sample virtuoso.ini file that comes with the distribution.

;
;  virtuoso.ini
;
;  Configuration file for the OpenLink Virtuoso VDBMS Server
;
;
;  Database setup
;
[Database]
DatabaseFile		= virtuoso.db
TransactionFile		= virtuoso.trx
ErrorLogFile		= virtuoso.log
ErrorLogLevel   	= 7
FileExtend      	= 200
Striping        	= 0
; LogSegments		= 1
; crashdump_start_dp	= 0
; crashdump_end_dp	= 0
; Log1			= log1.trx
;
;  Server parameters
;
[Parameters]
ServerPort         	= 1111
ServerThreads      	= 10
CheckpointInterval 	= 60
CheckpointSync 		= 2
NumberOfBuffers    	= 2000
MaxDirtyBuffers    	= 1200
MaxCheckpointRemap 	= 2000
PrefixResultNames	= 0
CaseMode           	= 1
;MinAutoCheckpointSize	= 4000000
AutoCheckpointLogSize	= 40000000
CheckpointAuditTrail		= 1

[HTTPServer]
ServerPort = 1122
ServerRoot = ../vsp
ServerThreads = 2
MaxKeepAlives = 10
KeepAliveTimeout = 10
MaxCachedProxyConnections = 10
ProxyConnectionCacheTimeout = 15
DavRoot = DAV

[Replication]
ServerName = log1
Server = 1
QueueMax = 50000

[Client]
SQL_QUERY_TIMEOUT 	= 0
SQL_TXN_TIMEOUT		  = 0
SQL_PREFETCH_ROWS		= 100
SQL_PREFETCH_BYTES	= 16000

[AutoRepair]
BadParentLinks	    = 0
BadDTP			        = 0

;[Ucms]
;UcmPath = /ucm
;Ucm1 = java-Cp933-1.3-P.ucm,Cp933
;Ucm2 = java-Cp949-1.3-P.ucm,Cp949|Korean
;Ucm3 = java-ISO2022KR-1.3-P.ucm,ISO2022KR|ISO2022-KR
;
;  Striping setup
;
;  These parameters have only effect when Striping is set to 1 in the
;  [Database] section, in which case the DatabaseFile parameter is ignored.
;
;  With striping, the database spans multiple segments
;  where each segment can have multiple stripes.
;
;  Format of the lines below:
;    Segment<number> = <size>, <stripe file name> [, <stripe file name> .. ]
;
;  <number> must be ordered from 1 up.
;
;  The <size> is the total size of the segment which is equally divided
;  across all stripes comprising  the segment. Its specification can be in
;  gigabytes (g), megabytes (m), kilobytes (k) or in database blocks
;  (b, the default)
;
;  Note that the segment size must be a multiple of the database page size
;  which is currently 8k. Also, the segment size must be divisible by the
;  number of stripe files constituting the segment.
;
;  The example below creates a 200 meg database striped on two segments
;  with two stripes of 50 meg and one of 100 meg.
;
;  You can always add more segments to the configuration, but once
;  added, do not change the setup.
;
[Striping]
Segment1	= 100M, db-seg1-1.db, db-seg1-2.db
Segment2	= 100M, db-seg2-1.db

6.1.1.5.3. Index Defragmentation

When data is inserted into tables, database pages are split and typically pages end up not being fully utilized. If data is inserted in ascending order of key value, which is often the case, space utilization is more efficient. Still, gaps can be left by updates, deletions and page splitting. Virtuoso has an autocompact feature which will take groups of adjacent dirty pages and see if it can fit the same content on a smaller number of pages. It will rewrite the pages to save space before writing these to disk. This operates automatically. Statistics on autocompaction are shown beside the Read ahead status line of the result set of status ('');. The number of pages affected by autocompaction and the number of resulting pages are shown. The numbers are cumulative since the start of the instance.

Autocompact is non-locking and if pages are busy, they will not be touched. Only relatively old dirty pages, about to be written to disk are considered for compaction, not interfering with the hottest part of the working set.

The automatic compaction is not, however, effective if pages are updated singly, never forming stretches of consecutive dirty pages. Therefore a manual compaction function called DB.DBA.VACUUM () is also offered.

The vacuum stored procedure takes an optional table name and index name and will read the index from beginning to end. If neither argument is given, all indices of all tables will be compacted. If only the table is given, all indices of that table will be compacted.

If consecutive leaves can fit on fewer pages than they currently occupy, vacuuming will rewrite them and free the pages left over. This does, however, require transient space, since the pages are not really replaced until the next checkpoint; hence a vacuum operation can run out of disk space. Using the checkpoint statement to force a checkpoint will free the space.
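For instance, compacting a single table and then forcing a checkpoint to release the freed pages (the table name is a hypothetical placeholder):

SQL> DB.DBA.VACUUM ('DB.DBA.MY_TABLE');
SQL> checkpoint;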

The effects and need for explicitly vacuuming a database can be assessed with the DB.DBA.SYS_INDEX_SPACE_STATS view.

Running:

select * from DB..SYS_INDEX_SPACE_STATS order by ISS_PAGES desc;

produces a result set with the most space-consuming index on top. ISS_PAGES is the total count of pages. ISS_ROW_BYTES is the byte count of the rows. If the total byte count divided by the page count is much below 8172 (8K minus a 20 byte header), chances are that vacuuming the index may save space. Note that blobs are not affected by vacuuming: if the blobs are small enough to fit on the row as normal strings they are already there; otherwise they occupy the needed number of pages and cannot be made more compact.

Note that querying the SYS_INDEX_SPACE_STATS view will always read through all the allocated pages of the database and may take a while. The operation is not locking. Only the state as of the last checkpoint will be shown, hence it is a good idea to run the checkpoint statement before querying this view.

Examples:

DB..VACUUM ();
-- Compact all tables and indices of the Virtuoso instance
DB..VACUUM ('WS.%');
-- Compact all tables of the WS qualifier
DB..VACUUM ('DB.DBA.RDF_QUAD', 'RDF_QUAD_PGOS');
-- Compact the RDF_QUAD_PGOS index of the RDF_QUAD table




6.1.1.6. Server Startup Command Line Options

6.1.1.6.1. Virtuoso Server

This section presents the command line switches of the Virtuoso server executable. Depending on the model and virtual database middleware, the server executable will have different names, all starting with virtuoso-. All of these take the same options on UNIX systems; the options on Windows platforms differ slightly.

On the Windows platform the following server command line options are available:

Usage:
  virtuoso-odbc-t.exe [-clnCbDARf---dSIMKmrd] [+configfile arg] [+licensefile arg]
                  [+no-checkpoint] [+checkpoint-only] [+backup-dump]
                  [+crash-dump] [+crash-dump-data-ini arg]
                  [+restore-crash-dump] [+foreground] [+pwdold arg]
                  [+pwddba arg] [+pwddav arg] [+debug] [+service arg]
                  [+instance arg] [+mode arg] [+dumpkeys arg] [+manual]
                  [+restore-backup arg] [+debug]
  +configfile            specify an alternate configuration file to use,
			or a directory where virtuoso.ini can be found
  +licensefile           specify an alternate license file to use,
			or a directory where virtuoso.ini can be found
  +no-checkpoint         do not checkpoint on startup
  +checkpoint-only       exit as soon as checkpoint on startup is complete
  +backup-dump           dump database into the transaction log, then exit
  +crash-dump            dump inconsistent database into the transaction log,
			then exit
  +crash-dump-data-ini   specify the DB ini to use for reading the data to dump
  +restore-crash-dump    restore from a crash-dump
  +foreground            run in the foreground
  +pwdold                Old DBA password
  +pwddba                New DBA password
  +pwddav                New DAV password
  +debug                 allocate a debugging console
  +service               specify a service action to perform
  +instance              specify a service instance to start/stop/create/delete
  +mode                  specify mode options for server startup (onbalr)
  +dumpkeys              specify key id(s) to dump on crash dump (default : all)
  +manual                specify when create a service to make it for manual startup
  +restore-backup        restore from online backup
  +debug                 Show additional debugging info

The argument to the +service option can be one of the following options
  start         start a service instance
  stop          stop a service instance
  create        create a service instance
  screate       create a service instance without deleting the existing one
  delete        delete a service instance
  list          list all service instances

The following are the server switches for UNIX platforms:

Usage:
  virtuoso-iodbc-t [-fclnCbDARwMKr---d] [+foreground] [+configfile arg]
                   [+licensefile arg] [+no-checkpoint] [+checkpoint-only]
                   [+backup-dump] [+crash-dump] [+crash-dump-data-ini arg]
                   [+restore-crash-dump] [+wait] [+mode arg] [+dumpkeys arg]
                   [+restore-backup arg] [+pwdold arg] [+pwddba arg]
                   [+pwddav arg] [+debug]
  +foreground            run in the foreground
  +configfile            use alternate configuration file
  +licensefile           use alternate license file
  +no-checkpoint         do not checkpoint on startup
  +checkpoint-only       exit as soon as checkpoint on startup is complete
  +backup-dump           dump database into the transaction log, then exit
  +crash-dump            dump inconsistent database into the transaction log, then exit
  +crash-dump-data-ini   specify the DB ini to use for reading the data to dump
  +restore-crash-dump    restore from a crash-dump
  +wait                  wait for background initialization to complete
  +mode                  specify mode options for server startup (onbalr)
  +dumpkeys              specify key id(s) to dump on crash dump (default : all)
  +restore-backup        restore from online backup
  +pwdold                Old DBA password
  +pwddba                New DBA password
  +pwddav                New DAV password
  +debug                 Show additional debugging info

The +crash-dump option will make use of the segmented log defined in virtuoso.ini for storing the recovery log. See Crash Recovery and virtuoso.ini below for more information. The other options will not use the segmented log.

The +restore-crash-dump option will alter the server startup sequence so that the recovery log produced by +crash-dump will be re-played correctly.

The +mode option can be a combination of the following letters:

On Unix platforms the executable will detach itself from the console and run in the background as a daemon unless the +foreground switch is specified.

For Windows NT and Windows 2000, the Virtuoso server will normally be installed as a Windows service and can be started from the Control Panel or automatically at system startup.

Ordinarily the Windows service will be a system process that runs in the background. If you want the Virtuoso service on Windows to allocate a debugging console, you can use the +debug (-d) switch. This switch is only applicable when starting a service.

Virtuoso on Windows can be run directly from the command line using the +foreground (-f) switch. The server will then start in the foreground of the current "cmd" session. If this switch is not used then the executable on Windows will assume that you are attempting to start a Virtuoso service.

Windows services can be created and removed from the system as required. The default installation under Windows will create a service by the name: OpenLink Virtuoso VDBMS Server, and optionally another service with the name OpenLink Virtuoso VDBMS Server [demo]. The Demo service is a supplied demonstration database that can be installed.

The following options are available to the +service switch for configuring Virtuoso services:

They are used together with +instance <name>, where <name> is the name of the instance to configure. All instances are listed in the services applet, with their names in square brackets.

+service list can be used to obtain the list of services that are registered with Windows.

For each service listed you can start, stop, or delete the service.

+service create can be used to create a new service. In this case you also need to specify any other startup options that should be associated with the new service entry. If you are using an alternative configuration file, it must be specified with the +configfile switch.
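For illustration, the following sketch creates a manually-started service instance named myapp with an alternate configuration file (the instance name and path are hypothetical), then starts it:

virtuoso-odbc-t.exe +service create +instance myapp +configfile C:\virtuoso\myapp\virtuoso.ini +manual
virtuoso-odbc-t.exe +service start +instance myapp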

Note:

Make sure the Services Control Panel is closed before attempting to modify services from the command line; otherwise locking may occur.



6.1.1.7. ZeroConfig ("Zero Configuration") Support

The "ZeroConfig" protocol, also known as "Zeroconf" or "Zero Configuration" is a protocol that allows discovery of services on the network that are advertised by their hosts. It also has provisions for automatic discovery of computers and various devices. The main benefit of ZeroConfig is that it does not require DHCP, DNS or directory servers.

ZeroConfig is an open protocol that Apple submitted to the IETF for a standard-creation process.

The Virtuoso server and ODBC driver use the capabilities of ZeroConfig to facilitate DSN (Data Source Name) setup and usage. This is divided in two parts: Server-side and Client-side.

The Virtuoso server (Server-side) is configured via the Virtuoso INI file to advertise its availability on a network under a given name. This allows applications, and in particular the Virtuoso ODBC driver, to receive information about a server, such as its network address, default login, etc., and use it for configuring a data source or directly making a connection.

The Virtuoso ODBC driver (Client side) uses ZeroConfig to locate the desired Virtuoso server during the set-up phase of a data source, and determine available connection options such as:

ZeroConfig provides the client with a service name, which is bound to the IP address/port of a host of the chosen service during DSN configuration. When an existing DSN that uses a ZeroConfig name is used to connect, the driver maps the name to an IP address and port before making the connection.

6.1.1.7.1. Setting-up the Server for Service Advertising

The Virtuoso server is configured to advertise itself based on the details specified in the [Zero Config] section of the Virtuoso INI file. Below is an example of such:

...
[Zero Config]
ServerName    = Virtuoso Server
ServerDSN     = UID=demo;PWD=
SSLServerName = Virtuoso Server (via SSL)
SSLServerDSN  = UID=dba;PWD=;ENCRYPT=1
...

The ServerName and SSLServerName are human readable strings chosen by the administrator to provide clients with a suitable description of the service being provided.

Note:

If the Virtuoso server does not have the SSL listener enabled then the SSL service will not be advertised. The SSL* keys will simply be ignored and do not need to be removed.

The ServerDSN and SSLServerDSN are default connection strings that can be used by clients to make the advertised connection. You only need to specify the default username and password in these strings. The default database can be specified or left to the setting for the username. You cannot specify the server hostname, IP address or port number; these are supplied by Virtuoso automatically.

ZeroConfig service advertising is multicast, hence it is advertised on all available network interfaces.


6.1.1.7.2. Using the Windows ODBC Driver with ZeroConfig

Upon DSN set-up the ODBC driver listens for advertising servers and compiles a roster, which is displayed so that the user can choose the desired service to connect to.

If ZeroConfig is used for data source set-up then the set-up dialog will be initialized based on the details in the connection string configured on the server.

When a DSN is configured based on a ZeroConfig service, the driver will resolve the service name before making the connection to the server. The driver does not store the network address or port number of the Virtuoso server, only the ZeroConfig server name, so if the server's physical address changes, the client DSNs associated with it do not all have to be reconfigured; they will resolve to the new address automatically on next use.



6.1.1.8. Server Status Monitoring

The database status report is divided into 6 sections:

6.1.1.8.1. Server

This section shows how many connections are open, how many threads the process has, and how many are running at the present time. It also displays the number of requests that have been received but are not yet running on any thread.


6.1.1.8.2. Database

  File size 203161600, 24800 pages, 259 free.
  7000 buffers, 6987 used, 3884 dirty 8 wired down, repl age 8251 .
  Disk Usage: 14246 reads avg 6 msec, 74% read last  10 s, 14457 writes,
    4 read ahead, batch = 5.
Gate:  2729 2nd in reads, 0 gate write waits, 3372547 in while read 0 busy scrap. 
Log = wi.log, 9851835 bytes
14950 pages have been changed since last backup (in checkpoint state)
Current backup timestamp: 0x0000-0x00-0x00
Last backup date: unknown
Clients: 18 connects, max 17 concurrent, 1024 licensed
RPC: 54441 calls, 17 pending, 17 max until now, 0 queued, 53988 burst reads (99%), 0 second 
Checkpoint Remap 7646 pages, 0 mapped back. 0 s atomic time.
    DB master 24800 total 259 free 7646 remap 3415 mapped back
   temp  200 total 196 free

The status consists of the following items:

File size:

The database file size in bytes, or 0 if the database consists of statically allocated files. The total number of 8K database pages follows, then the number of free pages. The buffers figure shows the total count of 8K file cache buffers, followed by the number of used buffers and the number of buffers that are dirty at the time. The wired down count is normally zero but can transiently be non-zero if pages are wired down for processing by threads in the server.

Disk Usage:

Shows the cumulative total number of reads and writes and the average length of time spent inside the read system call for the database files.

The percentage is the percentage of the real time spent inside read between this status report and the previous status report. This may exceed 100% if several reads are taking place concurrently on different stripes in a multi-file database.

The Gate:

Lists concurrent events. The 2nd in read is the count of concurrent requests for the same page, the gate write waits is the count of times a modify operation had to wait for exclusive access to a page being read by another thread, the in while read is the count of file cache hits that have taken place while a read system call was in progress on another thread.

Databases

This section shows the count of pages, free pages, checkpoint remap and mapped back pages for the main database and for the space for temporary data such as sort results and hash indices. The page count is the total size, the free count is the count of free pages, the checkpoint remap is the count of pages that occupy two pages in checkpoint space instead of one, and the mapped back count is the number of pages that will return to their original place in checkpoint space at the next checkpoint. A detailed understanding of these is not necessary.

See:

The Disk Configuration section for a discussion of checkpoint remapping.


6.1.1.8.3. Locks

The lock section shows various locking statistics accumulated since the server was started. The deadlock count is divided into deadlocks caused by a situation where several transactions read a page and one wants to get write access and all other deadlock situations. The first is called 2r1w deadlock in the report.

The lock section also shows the total number of threads running, i.e. engaged in performing some operation for a SQL or web client. The number of threads waiting is the number of running threads that are presently waiting for a lock. The number of threads in vdb is the number of threads engaged in remote database operations or other 'slow' I/O, such as access to outside HTTP or SOAP services.

All locks currently in effect are listed with their owners and any waiting transactions. The transactions are named after their clients. A log or replication replay transaction is named 'INTERNAL' here.


6.1.1.8.4. Clients

Each connected client is listed with the number of bytes sent to and received from the client. The transaction status is either PENDING for OK, or BLOWN OFF or DELTA ROLLED BACK for a transaction killed by deadlock or timeout. The locks owned by the transaction are listed following the status. IE means an exclusive lock and IS a shared lock.


6.1.1.8.5. Replication

This section shows the server in question and a list of replication accounts either provided or received by this server. Accounts where the server name (left column) is the same as this server's name are those provided by this server.

The columns are server name, account name, last transaction number and status. The status is OFF for a local account or a replicated account where the remote is not available. It is SYNCING if a resync is in progress, IN SYNC if the account is up to date or REMOTE DISCONNECTED if there was a connection to a remote party which subsequently disconnected.

See Also:

RDF Graph Replication


6.1.1.8.6. Index Usage

This part of the report summarizes the database's access statistics. The output is a table with a row for each index in the database. Each row is composed of the following columns:

Table		The name of the table
Index		The name of the index. Same as the table for primary key.
Touches		The number of touches since startup. Each time the database
		engine looks at an entry of the index is counted as a touch.
		Not all touched entries are selected. For instance if the
		engine scans a table with non-indexed selection criteria it
		will touch each row but might select none.
Reads		The number of disk reads caused by reading this index.
%Miss		The percentage of touches that required a read. This can be
		over 100% since getting one entry may require more than one
		read if the top levels of the index are not in memory.
Locks		The number of times a lock is set on an entry of the index.
Waits		The number of times the engine has to wait for another
		transaction to finish in order to set a lock on this index.
%W		The percentage of waits of all locks set.

In interactive SQL:

SQL> status();

will print out the report.



6.1.1.9. Re-labeling Server Executable on Win32 Platforms

The Virtuoso service name can be altered using the Win32ServiceName key in the Parameters section. The default name is 'OpenLink Virtuoso VDBMS Server'.

To change the name of services:

Services with old names must be deleted before creating a service with the new name, i.e. deleted while the Win32ServiceName setting is still set to the current name of the service.
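A sketch of the sequence (the instance name is hypothetical):

REM delete the service while virtuoso.ini still carries the old name
virtuoso-odbc-t.exe +service delete +instance myinstance
REM now set Win32ServiceName = <new name> in the [Parameters] section,
REM then recreate the service under the new name
virtuoso-odbc-t.exe +service create +instance myinstance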

The name displayed in the ODBC Administrator, Setup and Configuration dialogs is taken from the driver section in the ODBCINST registry key. This can be edited directly in the registry using the regedt32 Windows utility, or a registry import file can be created and applied by simply double-clicking the .reg file. Always exercise extreme caution when making changes to the registry.


6.1.1.10. Transport Level Security

6.1.1.10.1. Encryption

Virtuoso has the ability to encrypt its CLI network connections using SSL. The server listens on a separate port for SSL CLI connections and handles them just like normal CLI connections, but now with transport level security.

Server-side Support

Server-side secure connections use three parameters in the [Parameters] section of the virtuoso.ini:

All of these parameters must be set in order to enable the SSL CLI server.

If SSLServerPort is not specified, then the Virtuoso server ignores the other two and does not listen for SSL CLI connections.
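A minimal sketch of such a configuration, assuming the certificate and private key parameters are named SSLCertificate and SSLPrivateKey (the port and file names are placeholders):

[Parameters]
...
SSLServerPort  = 2111
SSLCertificate = server-cert.pem
SSLPrivateKey  = server-key.pem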

If a non-SSL connection is attempted to the SSL server port, the server rejects the connection. If an SSL connection is attempted against the non-SSL port the server rejects the connection.


Client-side Support

The client does not require any SSL-specific files (like Certificates or Private keys) in the SSL connection process.

Native Clients (e.g. ISQL)

There is a custom ODBC connect option SQL_ENCRYPT_CONNECTION (=5004) supported by the Virtuoso CLI. It should be set before issuing the SQLConnect call. Its values are NULL (no encryption, the default), '1' (encryption with no server X509 certificate checking and no X509 certificate sent to the server) or a valid file path to a PKCS#12 certificate file (protected with the same password as the one used to log in). Note that with the iODBC/ODBC clients this connect option is not applicable, since the driver managers do not cache or pass through custom ConnectOptions set before connecting to the data source. ISQL has an additional option -E to do encrypted connects using the encryption option '1', and -X <file> to set the above option to the file supplied.
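For illustration, connecting with ISQL to a hypothetical SSL listener on port 2111 (the certificate file name is a placeholder):

$ isql 2111 dba dba -E
$ isql 2111 dba dba -X my_cert.p12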

ODBC & iODBC Driver

The drivers support an additional DSN attribute:

ENCRYPT=<string>

It has the same meaning as the SQL_ENCRYPT_CONNECTION option described above. If this attribute is not specified, it defaults to "No" (no encryption).

The corresponding iODBC odbc.ini & ODBC Registry DSN attribute name is:

"Encrypt"= <string>

The Windows Connect & Setup dialogs have an additional wizard page to configure encryption.


X509 Certificate Support

Virtuoso supports X509 certificate validation: server side for both ODBC and HTTPS connections, and client side for the ODBC connections.

Server Side

To enable this option there are three new INI file parameters, added to the HTTPServer section for HTTPS and to the Parameters section for ODBC:

ODBC Client Side

In order for verification of the server certificate to take place, a PKCS#12 file should be supplied to the ODBC client. The client will use the CA list in this PKCS#12 file to verify the server certificate, setting the verification depth to -1 (unlimited) while performing the check.

If the server certificate is not verified correctly, the client will refuse to connect to the server. When the ENCRYPT parameter is set to "1" (do SSL without X509 validation), the client will return SQL_SUCCESS_WITH_INFO from SQLConnect/SQLDriverConnect with the server's certificate subject and the verification result, as the server always sends its X509 certificate to the client as part of the SSL connect handshake.

If the PKCS#12 file is supplied the ODBC client will try to open it using the login password. In order for the file to be successfully opened it should be encrypted with the same password used for logging in.

Normally, a PKCS#12 file exported from other programs will contain only the CAs of the certificate validation chain. This means that the client and server certificates should have a common CA in their certificate chains in order to be used for ODBC X509 validation. The client certificate from the PKCS#12 file takes no part in the server certificate validation process.

See Also:

get_certificate_info()



6.1.1.10.2. File System Access Control Lists

Access Control Lists (ACL) are used to restrict file system access.

These lists are maintained in the Virtuoso INI file under the Parameters section with entries such as:

DirsAllowed = <path> [, <path>]
DirsDenied = <path> [, <path>]

<path> := <absolute_path> or <relative_path>
Note:

A relative path is relative to the server's current working directory.

The Virtuoso ISQL utility can be used to check the Server DirsAllowed params as follows:

SQL> select server_root (), virtuoso_ini_path ();

The result should show the server's working directory and the INI file name.

You can also check the relevant INI setting by running the following statement via the ISQL command line utility:

SQL> select cfg_item_value (virtuoso_ini_path (), 'Parameters',
'DirsAllowed');
See Also:

Virtuoso INI File Configuration

ACLs work in the following way:

The following functions are restricted by file Access Control Lists (ACL) in the virtuoso.ini file:

Note:

The cfg_write function has restrictions against changing the file access control lists in the INI file.




6.1.2. Virtual Database

6.1.2.1. Linking Remote Tables & Views

The Virtuoso Server supports linking in of tables, views, procedures and other components from a remote ODBC data-source, enabling them to be queried as native objects in Virtuoso; this process is called "attaching" or "linking". The easiest way to link to an external table is to use the Linking Remote Tables Wizard, part of the Visual Server Administration Interface. Alternatively you can attach these objects programmatically, as this section explains; finally you can attach tables manually - see Manually Setting Up A Remote Data Source which is useful for connections to less-capable ODBC data-sources.

6.1.2.1.1. ATTACH TABLE Statement
ATTACH TABLE <table> [PRIMARY KEY '(' <column> [, ...] ')']
  [AS <local_name>] FROM <dsn> [USER <uid> PASSWORD <pwd>]
  [ON SELECT] [REMOTE AS <literal_table_name>]
table:

Adequately qualified table name of the form: identifier | identifier.identifier | identifier.identifier.identifier | identifier..identifier

column:

column to assume primary key

local_name:

fully qualified table name specifying local reference.

dsn:

scalar_exp

user:

scalar_exp

password:

scalar_exp

literal_table_name:

scalar_exp

This SQL statement defines a remote data source, copies the schema information of a given table to the local database and defines the table as a remote table residing on the data source in question.

The table is a designation of the table's name on the remote data source dsn. It may consist of an optional qualifier, optional owner and table names, separated by dots. This must identify exactly one table on the remote dsn. The optional local_name is an optionally qualified table name which will identify the table on the local database. If qualifier or owner are omitted, these default to the current qualifier 'dbname()' and the logged in user, as with CREATE TABLE. If the local_name is not given it defaults to the <current qualifier>.<dsn>.<table name on dsn>. The <dsn> will be the dsn with all alphabetic characters in upper case and all non-alphanumeric characters replaced by underscores. The <table name on dsn> will be the exact name as it is on the remote dsn, case unmodified.

The PRIMARY KEY option is only required for attaching views or tables where the primary key on the remote table cannot be ascertained directly from the remote data source.

If a dsn is not previously defined with vd_remote_data_source or ATTACH TABLE, the USER and PASSWORD clauses have to be given.

The REMOTE AS option allows you to provide a string literal for referencing the remote table. This is useful when linking tables from sources that do not support three-part qualification correctly.


6.1.2.1.2. Attaching views

A view can be attached as a table if a set of uniquely identifying columns is supplied.

This is done with the PRIMARY KEY option to ATTACH TABLE as follows:

attach table T1_VIEW primary key (ROW_NO) from 'somedsn';
Note:

Views cannot be attached unless the PRIMARY KEY option is used.


6.1.2.1.3. Examples for Linking Remote Tables into OpenLink Virtuoso
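
A minimal sketch, assuming a remote ODBC data source named 'oracle_dsn' exposing a table SCOTT.EMP (the DSN, login and table names are all hypothetical):

ATTACH TABLE SCOTT.EMP AS DB.DBA.ORA_EMP
  FROM 'oracle_dsn' USER 'scott' PASSWORD 'tiger';

-- the linked table can now be queried like a local one:
SELECT COUNT(*) FROM DB.DBA.ORA_EMP;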


6.1.2.2. Linking Remote Procedures

6.1.2.2.1. ATTACH PROCEDURE Statement
ATTACH (PROCEDURE|FUNCTION) <proc_name> ([<parameter1>[,<parameter2>[...]]])
  [ RETURNS <rettype> ] [AS <local_name>] FROM <dsn>
	
dsn:

scalar_exp

proc_name:

identifier | identifier.identifier | identifier.identifier.identifier | identifier..identifier

parameter1..parameterN:

parameters declaration (as in CREATE PROCEDURE)

local_name:

table

The ATTACH PROCEDURE statement allows you to associate stored procedures from remote data sources with Virtuoso so they can be used as if they were defined directly within Virtuoso. Much like the ATTACH TABLE statement, this SQL statement creates a local alias for a procedure on a given remote data source, so it can be treated as locally defined. When this alias is called, the procedure at the remote data source will actually be called.

Procedure-generated result sets are not supported by the ATTACH PROCEDURE statement. The only portable way to return values from a remote procedure is to use INOUT or OUT parameters. Remote procedure result sets can be used by a combination of rexecute() and Virtuoso PL, but this is left for the user to implement as required.

The ATTACH PROCEDURE statement cannot define new connections to remote data sources; the connection should be defined beforehand, either using vd_remote_data_source or by attaching a table or view using the ATTACH TABLE statement with USER/PASSWORD supplied.

Note that when generating pass-through statements to a given remote, any procedure call for an attached procedure is passed through if the current DSN is the same as the remote procedure's DSN.

The proc_name is the designation of the procedure's name on the remote data source, DSN. The remote procedure name supplied should always be fully qualified to avoid ambiguity, it may consist of an optional qualifier/catalog, optional owner and finally procedure name, separated by dots. This must identify exactly one procedure on the remote data source.

The optional local_name is an optionally qualified procedure name which will identify the procedure on the local Virtuoso database. If the local_name is not given it defaults to the <current qualifier>.<dsn>.<proc name on dsn>. The <dsn> will be the data source name in upper case and with all non-alphanumeric characters replaced by underscores. The <proc name on dsn> will be the exact name as it is on the remote dsn, case unmodified.

If a dsn is not previously defined with vd_remote_data_source or ATTACH TABLE, the ATTACH PROCEDURE will fail.

Example:

On remote Virtuoso (DSN name : remote_virt):

   CREATE PROCEDURE RPROC (IN PARAM VARCHAR) returns VARCHAR
   {
     return (ucase (PARAM));
   }

On the local virtuoso (DSN name : local_virt) :

vd_remote_data_source ('remote_virt', '', 'demo', 'demopwd');
ATTACH PROCEDURE RPROC (IN PARAM VARCHAR) RETURNS VARCHAR from 'remote_virt';

will result in the creation of a procedure alias for RPROC in local_virt named DB.REMOTE_VIRT.RPROC.

Calling it from local_virt (using ISQL):

select REMOTE_VIRT.RPROC ('MiXeD CaSe') as rproc_result;

rproc_result
---------------
MIXED CASE
1 rows

See Also:

The Virtuoso Visual Server Administration Interface provides a graphical user interface for linking remote stored procedures.


6.1.2.3. Data Type Mappings

If a statement is passed through to a remote data source, the types returned by SQLDescribeCol are taken directly from the remote prepare and converted to the closest Virtuoso supported type.

If a statement involves both local and remote resources all types are taken from the Virtuoso copy of the data dictionary.

In executing remote selects Virtuoso binds output columns according to the type and precision given by SQLDescribeCol after preparing the remote statement.

When a table is attached from a remote data source the catalog is read and the equivalent entries are created in Virtuoso. Since the types present on different DBMS's vary, the following logic is used to map ODBC types to Virtuoso types.

Table: 6.1.2.3.1. Attach Table Type Mappings

SQL Type           Mapped Type
SQL_CHAR           varchar (precision)
SQL_VARCHAR        varchar (precision)
SQL_BIT            smallint
SQL_TINYINT        smallint
SQL_SMALLINT       smallint
SQL_INTEGER        integer
SQL_BIGINT         decimal (20)
SQL_DECIMAL,       smallint if precision < 5 and scale = 0;
SQL_NUMERIC        integer if precision < 10 and scale = 0;
                   double precision if precision < 16;
                   decimal (precision, scale) otherwise
SQL_REAL           real
SQL_FLOAT          double precision
SQL_DOUBLE         double precision
SQL_BINARY         varbinary (precision)
SQL_VARBINARY      varbinary (precision)
SQL_LONGVARCHAR    long varchar
SQL_LONGVARBINARY  long varbinary
SQL_DATE           date
SQL_TIME           time
SQL_TIMESTAMP      datetime

Note:

The general case of decimal and numeric both revert to the Virtuoso decimal type, which is a variable precision decimal floating point.


6.1.2.4. Transaction Model

One transaction on the Virtuoso VDBMS server may contain operations on multiple remote data sources. As a general rule remote connections are in manual commit mode and Virtuoso either commits or rolls back the transaction on each of the remote connections as the main transaction terminates.

ODBC does not support two phase commit. Therefore a transaction that succeeds on one remote party may fail on another.

A transaction involving local tables and tables on one remote data source will always complete correctly since the remote is committed before the local and the local is rolled back if the remote commit fails.

Note that even though the client's connection to Virtuoso may be in autocommit mode, the underlying remote connections will typically not be autocommitting.

A remote connection is in autocommit mode if the Virtuoso connection is and the statement is passed through unmodified. In all other cases remote connections are in manual commit mode.

Virtuoso supports 2PC - Two Phase Commit. See the Distributed Transaction & Two Phase Commit section for more information.


6.1.2.5. Virtual Database and SQL Functions

Different DBMS's support slightly different sets of SQL built-in functions with slightly differing names. For example, what is called substring on Virtuoso is called substr on Informix. Virtuoso allows declaring equivalences between local user-defined or built-in functions and user defined or built-in functions on remote servers. Knowing that a function exists on a remote database allows passing processing closer to the data, resulting in evident gains in performance.

To declare that substring is called substr on DSN inf10, you can execute:

db..vd_pass_through_function ('inf10', 'substring', 'substr');

The first argument is the name of the remote database, as used with attach table and related statements. If user defined functions with qualified names are involved, the names should be qualified in the vd_pass_through_function call also. If many qualified or unqualified forms of the name are in use, one should declare the mapping for them all.

To verify whether this declaration takes effect, one can use explain to see the query execution plan, for example:

explain ('select substring (str, 1, 2) from inf10.sample_table');

The declarations are persistent and can be dropped by using a last argument of NULL for a given function. The declarations are kept at the level of a DSN and not at the level of the type of DBMS because different instances can have different user defined functions defined.
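For instance, the substr mapping declared above could be dropped with:

db..vd_pass_through_function ('inf10', 'substring', NULL);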


6.1.2.6. Virtual Database and SQL Optimizer Statistics

If a query can be executed in its entirety on a single remote database, then optimizing this query is exclusively the business of the remote database, as it gets the whole query text. Virtuoso rewrites some things and suggests a join order but these are not binding on the remote database.

If a query involves tables from multiple remote databases or a mix of local and remote tables, knowing basic SQL statistics on the tables is essential for producing a meaningful query plan. Virtuoso has information on indices existing on remote tables, and if the remote table is attached from a known type of DBMS, Virtuoso may read the DBMS-specific statistics at attach time.

Note that the statistics of the remote database should be up to date before attaching.

The function sys_db_stat can be used for forcing a refresh of Virtuoso's statistics on remote tables.

sys_db_stat (5, 0)

will go through all tables, local and remote. For local tables, it will update statistics with a 5 percent sampling rate; for remote tables, it will refresh the statistics if the type of the host DBMS is among the supported ones. If the remote DBMS is of an unknown type, Virtuoso will take the count of the remote table and select the first 1000 rows to get a sample of data lengths and column cardinalities. This is not very precise but is better than nothing.

In order to force a full read of a remote table for statistics gathering, one can use

db..sys_stat_analyze ('fully qualified table name', 0, 0);

The table name is case sensitive, with all qualifiers, as it appears in SYS_KEYS and other system tables. This will read the whole table.

Statistics on local as well as remote tables are kept in SYS_COL_STAT. One may look at this table to see the effects of remote statistics collection. In special cases, if a special effect is desired or the information is not otherwise available, as in the case of a very large table on an unsupported type of server, it is possible to manually update the contents of the table. Shutting down and restarting Virtuoso will force a reload of the statistics information.

Presently Oracle, Informix and Virtuoso are supported for direct access to the remote database's statistics system tables. It is possible to define hook functions for accessing this same information from any other type of DBMS. Please contact support for instructions on this.


6.1.2.7. Distributed Query Optimization

When a query contains mixes of tables from different sources, the compiler must decide on an efficient execution plan that minimizes the number of round trips to remote servers and evaluates query conditions close to the data when possible. Additionally, any normal query optimization considerations such as choice of join order and join type apply. See the section on SQL optimization and optimizer hints for more on this. Additionally, the SQL optimizer uses round trip time statistics for the servers involved in the query.

In the following examples, we use the tables r1..t1, r2..t1 and t1, of which r1..t1 is on a server close by, r2..t1 on a server farther away and t1 on the local server. The column row_no is a unique key, the string1 column is indexed with 300 distinct values, and the column fs1 has 3 distinct values. The tables all have 100000 rows. A round trip to r1 takes 10 ms and a round trip to r2 takes 100 ms.

Consider

select * from r1..t1 a, t1 b where a.row_no = b.row_no and a.fs1 = 'value1';

The compiler notices that 33333 rows will be selected from r1..t1 based on fs1. It will decide to read these into a hash table, causing one linear scan of r1..t1 with relatively few round trips. Then it will read t1 locally and select the rows for which there is a matching entry in the hash. This is slightly faster than doing 33333 random lookups of t1. If fewer rows were selected from r1..t1, the compiler would do a loop join with the local t1 as the inner loop.

The absolute worst plan would be a loop join with t1 as the outer loop, with 100000 round trips to r1.

Now, if many tables are accessed from the same data source, the compiler tries to bundle these together into one statement. Thus, for:

select * from r1..t1 a, r1..t1 b, t1 c where c.string1 = '111' and b.row_no = c.row_no and a.row_no = b.row_no + 1;

The compiler will probably do the outer loop for t1, which is expected to select 100000/300 rows. Then it will do a round trip to r1 with the statement:

select * from t1 a, t1 b where a.row_no = b.row_no + 1 and a.row_no = ?

This is likely better than doing the remote part as an outer loop, bringing in all of the approximately 100000 results. 333 round trips selecting 1 row each beat transferring 100000 rows. If the data source were farther away, this could be otherwise; hence the importance of the round trip time statistic.

In distributed queries, the compiler will honor the option (order) and the join types, e.g. the table option (hash), insofar as the tables are local.

Thus, if we wrote

select * from r1..t1 a, t1 b, r1..t1 c where c.string1 = '111' and b.row_no = c.row_no and a.row_no = b.row_no + 1 option (order);

the compiler could not merge the two tables from r1 into a single query, because the order was given and there is an intervening table not located on r1.


6.1.2.8. Use of Array Parameters

ODBC and other data access API's usually offer a mechanism for executing a single parametrized statement multiple times with a single client-server round trip. This is usually called support of array parameters.

Virtuoso can make use of array parameter support in ODBC drivers when available. Consider the statement:

insert into r1..t1 select * from t1;

Without array parameters, this would make a round trip to r1 for each row in t1. With array parameters enabled and a batch size of 100, this would make only 1000 round trips for 100000 rows, resulting in a dramatic increase in performance. Typically, if the remote server is on the same machine, array parameters make such batch operations about 3x faster. If the remote is farther away, the gain is greater.

Array parameters are used if the remote database and its ODBC driver support them. The feature is globally disabled in the default configuration because some ODBC drivers falsely claim to support array parameters. To enable this feature, set the ArrayOptimization entry in the [VDB] section of the ini file to 1. To set the batch size, use the NumArrayParameters setting; 100 is a reasonable value.
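The corresponding ini entries would look like this:

[VDB]
ArrayOptimization  = 1
NumArrayParameters = 100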

Some ODBC drivers also support array parameters for select statements. To enable using this, you can set the ArrayOptimization setting to 2. This may however not work with some drivers even if DML (insert/update/delete) statements do work with array parameters.


6.1.2.9. Timestamps & Autoincrement

A transaction timestamp is not the same across the transaction if the transaction has branches in different databases.

The data type and precision of a time stamp will also vary between different types of databases.

Hence timestamp columns coming from tables on different servers are not comparable for equality.

In inserts and updates of remote tables timestamps are assigned by the database where the table in question is physically located.

Identity or autoincrement columns are likewise handled by the database holding the remote table.

Note that MS SQL Server and Virtuoso describe a timestamp column as a binary column in ODBC catalog and meta data calls. Thus remote SQL Server or Virtuoso timestamps will not appear as timestamps at all.

In the case of a Virtuoso remote database the binary timestamp can be cast into a DATETIME data type and it will appear as a meaningful datetime.


6.1.2.10. VDB Stored Procedures & Functions

These procedures allow you to manually manage remote data sources and their tables.

Functions capable of returning a result set make use of the results_set parameter. To prevent them from returning a result set, set the results_set parameter to null. If Virtuoso finds that a waiting parameter contains results_set, it will fetch the result set regardless of the cursor_handle parameter.

Unless explicitly granted, only the dba group is permitted to use rexecute(), in order to maintain security. Caution is required here, since any user granted use of rexecute() has full control of the remote data source set up by the DBA, limited only by the abilities of the remote user on the remote data source. Users can be granted and denied access to this function using the following commands:

GRANT REXECUTE ON '<attached_dsn_name>' TO <user_name>
REVOKE REXECUTE ON '<attached_dsn_name>' FROM <user_name>

The following remote catalogue functions help you to obtain information about the remote data sources that you are using. These can be especially useful in Virtuoso PL later on if you cannot know everything about the remote tables ahead of time for the ATTACH TABLE statement.


6.1.2.11. Manually Setting Up A Remote Data Source

Defining a remote table involves declaring the table as a local table, defining the data source if it is not already defined, and associating the new table with the remote data source.

The data source on which a table resides is declared at the table level. This has no connection to the table's qualifier.

Assume a remote ODBC data source named test containing a table xyz declared as follows:

Example:
   CREATE TABLE XYZ (
   A INTEGER,
	B INTEGER,
	C INTEGER,
	PRIMARY KEY (A));

To define this as a remote table in Virtuoso, first define the table locally using the CREATE TABLE statement above.

Then define the data source:

DB..vd_remote_data_source ('test', '', 'sa','');

And the table:

DB..vd_remote_table ('test', 'DB.DBA.XYZ', 'master.dbo.XYZ');

This assumes that the remote data source has a login 'sa' with an empty password and no special connection string options. The table names in vd_remote_table have to be fully qualified. We assume here that the Virtuoso table was created by the DBA under the default qualifier DB and that the remote XYZ was created by dbo in master.

The vd_remote_table declaration does not affect statements or procedures compiled prior to the declaration.

Additional indices of remote tables may optionally be defined. They do not affect the operation of the SQL compiler. The remote data source makes the choice of index based on the order by clause in the statement passed through.
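Once declared, the remote table can be queried like any local table; statements compiled after the declaration are passed through to the remote data source. For example:

SQL> select A, B, C from DB.DBA.XYZ where A = 1;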


6.1.2.12. Caveats

When Virtuoso interacts with a table or view attached from a remote data source, it must be able to uniquely identify each row of the query. At attach time Virtuoso will query the remote data source for the table's primary keys and indices. These are used to construct a copy of the table definition in Virtuoso, which is then used when referencing the remote data source. At query time this information is used as much as possible. It may need to be supplemented by calls to SQLStatistics() for further index or primary key information; as a last resort Virtuoso will use SQLColAttribute() to determine which columns are SQL_DESC_SEARCHABLE.



6.1.3. Virtuoso User Model

The Virtuoso User Model is designed to support:

There is a set of functions for administering the users and groups (roles) of a Virtuoso database. All the user administration functions are restricted to members of the dba group only.

Note:

The terms 'role' and 'group' are identical in their usage in this document. The terms security object or grantee refer collectively to users and groups.

User manipulation functions do not generally return values. In case of error an error condition will be signaled.

Users and roles are in the same namespace and are identified by name. One name will not both designate a user and a group. A user or group may be SQL enabled. A name that identifies a role (group) does not have a password and is not a valid login name.

A user may belong to multiple roles. A role may inherit multiple other roles. One role may be inherited or directly granted to a security object multiple times. The order of granting is not relevant. The effective privileges of a user are those granted to public, those granted to the user, plus those granted to a direct or indirect role of the user. The relationship between users and roles is many to many. A role can be granted multiple other roles as well as direct permissions. Cycles in granting roles are not allowed; an error will be signaled if an operation would result in a cycle in the role inheritance graph.

When SQL procedures or queries are executing, the effective privileges are those granted to the owner of the procedure or view. A top level dynamic SQL statement executes with the effective privileges of the logged in user. If a SQL statement executes as a result of an HTTP request, the applicable virtual directory specifies on behalf of which SQL account the SQL is executed. A role cannot be the owner of procedures, views or other executable SQL entities.

6.1.3.1. Security Objects Management

The following functions allow for creation and deletion of security objects and roles, and for assigning or removing roles from security objects:

The security objects and roles data are contained in the system tables described in the User System Tables section of the Appendix.


6.1.3.2. User Options

The functions for setting and getting these options will accept any other named values as well; the above list contains only those reserved for Virtuoso so far.


6.1.3.3. Login Extension PL Hook

DB.DBA.USER_FIND (in name varchar);

This is a user-defined PL function hook which, if it exists, will be executed before doing the SQL/ODBC login. In this hook the user can find a user account on some other server and register it in the local database, or perform some other pre-login actions. It is similar to DBEV_LOGIN, but it does not change any account validation rule; it is purely for pre-processing.

See Also:

The Database Event Hooks chapter.

6.1.3.3.1. PL Hooks Examples
Querying an LDAP Server
create procedure DB.DBA.LDAP_SEARCH (inout user_name varchar, in digest varchar)
{
  whenever sqlstate '28000' goto ldap_validation_failure;
  if (lcase (user_name) <> 'dba')
    {
      ldap_search('foo.bar.com', 0, 'o=organization', '(cn=a*)',
		sprintf('uid=%s,ou=users,o=organization', user_name), pwd_magic_calc(user_name,digest,1));
      user_name := 'dba';
      return 1; -- force validation as dba
    }
  else 
    {
      -- bypassing ldap authentication for dba, let validation occur normally
      return -1; 
    }

ldap_validation_failure:
  return -1; -- try to validate normally
}
create procedure DB.DBA.DBEV_LOGIN (inout user_name varchar, in digest varchar, in session_random varchar) 
{
  declare get_pwd varchar;
  get_pwd := user_get_option (user_name, 'PASSWORD_MODE'); 
  if (get_pwd is not null)
    {
      declare rc integer;
      rc := call (get_pwd) (user_name, digest);
      return rc;	  
    }    
  return -1;
};
user_create ('test_ldap', 'secret', vector ('PASSWORD_MODE', 'DB.DBA.LDAP_SEARCH'));
USER_FIND PL Hook Example
create table 
MY_DBA_USERS (M_NAME varchar primary key, M_PASSWORD varchar);
create procedure 
DB.DBA.USER_FIND (in name varchar)
{
  -- do nothing for existing users
  if (exists (select 1 from SYS_USERS where U_NAME = name))
    return;
  -- if the name is in the custom table
  if (exists (select 1 from MY_DBA_USERS where M_NAME = name))
    {
      declare pwd varchar;
      -- get password
      select M_PASSWORD into pwd from MY_DBA_USERS where M_NAME = name;
      -- create a new SQL user based on external data
      USER_CREATE (name, pwd, NULL);
    }
};


6.1.3.4. SQL Role Semantics

The terms user group and role are used interchangeably. Roles can be nested. There is a many to many relationship between users and roles. There is likewise a similar, acyclic many to many relationship between roles. Each role has a component role list of its granted (parent) roles, recursively, with no cycles allowed.

All role grants are listed in the roles system table, whether granted explicitly or as a result of granting a group that itself has groups granted to it. The role grant graph has an explicit edge for each role membership, direct or otherwise. The GI_DIRECT flag is true if the grant is direct. Only direct role grants can be revoked.

See Also:

The role system table description can be found in the appendix under System Tables.

The following SQL statements deal with roles. To create a new role (group) object the following statement can be used:

CREATE ROLE <NAME>

The <NAME> is a name of role to be created. It must be unique in space of all security objects.

Creating a security role
SQL> create role admins;

Use the following statement to remove an existing role from the security schema.

DROP ROLE <NAME>
Removing a security role
SQL> drop role admins;

The GRANT and REVOKE statements are used for controlling role membership as follows: To assign a new group, or list of groups (<ROLE>,...) to user <USER> use:

GRANT <ROLE> [, <ROLE>] TO <USER> [WITH ADMIN OPTION];

If the admin option is specified, the grantee can grant this same privilege further to other grantees.

Roles can be revoked using:

REVOKE <ROLE> [, <ROLE>] FROM <USER>;
Granting & revoking security roles
SQL> grant admins, users to demo;
SQL> revoke admins from demo;

Only the dba group accounts may administer roles.

The dba group is not physically a role. When an empty database is created, it will have the dba account with full privileges. To grant these same full privileges to another user, the dba uses the grant all privileges to <grantee> statement. This will give the grantee full dba privileges, or in other words, add the grantee to the dba group. This may be reversed with the revoke all privileges from <grantee> statement.
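For example, with a hypothetical account named operator:

SQL> grant all privileges to operator;
SQL> revoke all privileges from operator;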

The GRANT statement accepts any valid SQL security object in the TO clause. One cannot log in or perform any operations as a role. Roles are exclusively shorthand for specific sets of permissions which are changed together and are needed for multiple users.



6.1.4. VAD - Virtuoso Application Distribution

VAD provides a package distribution framework for installation, management, dependency checking and un-installation of Virtuoso applications. A VAD package contains all required Virtuoso components, which would constitute an application or hosted solution, within a single distributable file. A VAD package cannot contain any system parts independent of Virtuoso, thus excluding operating system executables, shared objects, installers or settings.

Virtuoso and VAD provide the following abilities:

6.1.4.1. Summary of VAD Operations

The following is what the dba needs to know about VAD packages.

A VAD package is installed from a file with the db..vad_install SQL function. The first argument is the file path, which must be in a location that the server process can open, i.e. one listed in the DirsAllowed parameter in the virtuoso.ini file. The second argument is 0, meaning that we are installing from a file.

SQL> vad_install ('conductor_dav.vad', 0);

is an example. If the package installation fails, the server exits and will have to be restarted. No effects of a failed installation will remain in the database after restart. Contact the supplier of the VAD package for further instructions.

To know what is installed, do:

SQL> vad_list_packages ();

VAD package installations are not recorded in the transaction log. Thus, if recovery is done from a backup followed by archived transaction logs (produced when CheckpointAuditTrail is on in virtuoso.ini), the VAD install must be performed before replaying any logs that were made after the VAD installation; the package installation must be in just the right place in the replay sequence. In practice it is simplest to make an incremental backup after installing packages; see backup_online() or the section on backing up.
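For example, a minimal sketch of such a sequence, using a hypothetical backup file prefix:

SQL> vad_install ('conductor_dav.vad', 0);
SQL> checkpoint;
SQL> backup_online ('post_vad_#', 150);

The checkpoint ensures the freshly installed package is in the checkpoint space that backup_online() reads.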

For any further information, including how to make VAD packages, see the rest of this chapter.


6.1.4.2. VAD Package Composition

A VAD package has no developer tie-ins; it is built in a development environment from source code that can be managed and versioned in the developer's system of preference.

The VAD package is described by an XML structure called the 'VAD Sticker'. The VAD sticker describes items to deploy, procedures to execute at install and uninstall time and all conditions to be checked to ensure correct installation. The VAD Sticker consists of the following:


6.1.4.3. Package Versioning

All required packages should be listed in the VAD sticker. Known conflicts may be listed in either of the conflicting VAD packages' stickers, hence the VAD stickers of all installed packages should be checked.

Later versions of a package may be installed replacing earlier versions of the same package. This can however be prohibited by listing either version (or a version limit) as a known conflict in either VAD package sticker in the usual way. Furthermore, it is possible to prevent re-installation of a package by stating that it conflicts with itself. This provides some security against exploits involving attempts to upgrade, downgrade or re-install a package in the hope that the administrator may corrupt the existing installation by installing new packages and working through their dependencies.

Packages may differ in the language and encoding of documentation and resource files, even though the version number remains the same. If a package is sensitive to internationalization issues, the developer should either assign different names to the various localizations of the package, or divide it into a kernel package for the language-independent parts and a set of language-specific packages, with appropriate dependencies between them.


6.1.4.4. Processing of Resources

During creation of a VAD package, the "location" mentioned above may be the name of a file in the file system, a URI, or a DAV path. At package-build time, URIs are resolved and the resources under them are copied into the package. The resulting sticker will thus contain the location of the resource within the package, the resource itself, and the target location.

All SQL files have a specific order of loading. Tables, views etc. must be defined before being referenced.


6.1.4.5. Unsupported Features of VAD

The VAD specification explicitly does not define the following:

Method of development or environment

There are no specific restrictions for the schema or Virtuoso/PL code of the package. The VAD system does not make assumptions on the method of software development.

Method of source code control or versioning

Version numbers used in the sticker have nothing in common with tag labels in a developer's versioning system. Procedures edited directly within the database using a web interface or CASE tools should be exported to a file for inclusion in a VAD package. If the application developer uses some script to export such code, this script is not usually part of the sticker or the resulting package.

Shipping/Deployment the VAD package from vendor to user

VAD provides no methods for downloading dependent packages, checking for package updates, etc.

Concurrent running of multiple versions of the same VAD package on a single server

There can be no guarantee that pre- or post-installation checks will provide valid results if more than one VAD is being processed at the same time. VAD does however guarantee that a package installation will be either entirely successful or entirely rolled back.

Installation or maintenance of non-Virtuoso hosted components

Unlike Virtuoso-based packages, these components are usually operating system specific, they may require some complex tuning, and their usage from within Virtuoso applications may even require changes in virtuoso.ini configuration file. VAD packages may contain test calls in pre-installation SQL procedures to check that required external executables are available and provide the functionality required.

Data migration

Some installations may require several days to complete migration/conversion of stored data. Whilst it may be possible to provide a restricted service during such time, VAD contains no tools to simplify such a process; this is left to the administrator or developer. VAD completes its work right after the execution of the post-installation code.

Synchronous installation of a package on all hosts of a distributed system or cluster

VAD has no standardized metadata regarding replication issues, hence package-specific code may be required. Similarly, if a cluster uses "round-robin" or a "director" loading management system and the server should be stopped for VAD installation, the administrator should explicitly inform the cluster manager about this event.


6.1.4.6. Security

Since VAD packages are run by an administrator as the database DBA user, care must be taken to ensure the package comes from a safe source. Any new package installed may violate the security regulations of the target database and may even inflict damage to files under the web-root of the Virtuoso Server or in directories specified in the "DirsAllowed" parameter of the virtuoso.ini. If the virtuoso.ini parameter "AllowOsCalls" is enabled, then the installation procedures of the package may call operating system executables. It is the responsibility of the database administrator to control this via the "AllowOsCalls", "SafeExecutables" and "DbaExecutables" parameters of the virtuoso.ini.

VAD packages do not offer any automatic protection against unauthorized modifications. Although every VAD package contains a checksum, its purpose is to guard against data transfer errors; it may not be sufficient to detect unwanted modification.


6.1.4.7. Building VAD Packages

Initially, the VAD sticker and resources may reside in the file system, a DAV directory, or other locations available through the DB.DBA.HTTP_URI_GET() function.

The VAD creation operation parses the VAD sticker's XML description and constructs the VAD file by calling DB.DBA.VAD_PACK.

This function reads the VAD sticker identified by the sticker_uri, which contains the vad:package root element. Then the resources identified in the sticker are retrieved. All resource URIs are interpreted in the context of the base_uri_of_resources and are parsed and checked to be syntactically correct. Resources are appended to the generated package, which is stored at the package_uri. DB.DBA.VAD_PACK returns a human-readable log of error and warning messages; it signals an error if any resource or database object is unavailable at build time.

By convention, VAD package files have the extension '.vad'.


6.1.4.8. VAD Utilities

An optional VAD package named VADutils provides various tools for capturing changes made in the database after some point in time. The result of a capture consists of:

The capture results may be useful for the following purposes:

These mechanisms provide good support for centralized development and custom deployment methodology. If a site is localized to contain local links, graphics, custom layout and such, then VAD capabilities offer help to the developer to define the specific overlay of customizations over another VAD package. When the underlying VAD package is updated the local customizations will be overwritten. Being saved in a VAD package, customizations can be reapplied over the updated base package.


6.1.4.9. VAD Administrator Responsibilities

VAD package installation, upgrade and uninstallation require a temporary break of service. The package checks may be performed on the fly if it can be guaranteed that the resources being inspected will not be altered by any users. The package check is a read-only process and operates solely within the VAD Registry using read-only functions.

All VAD operations are logged in the server event log. All completed operations are reflected in the DB.DBA.VAD_HISTORY system table.

The optional VADutils package provides some additional administrative tools, mostly for troubleshooting. These include special installation and de-installation functions that can ignore error signals, and provide an interactive editor for the VAD Registry etc.

All operations described below require DBA access to the database.

Check if a VAD package may be installed by calling DB.DBA.VAD_CHECK_INSTALLABILITY.

This checks the presence and versions of required packages and of the Virtuoso platform. It does not execute any pre-install Virtuoso/PL code from the package, so there is no guarantee that installation will succeed even if the check finds no error. If package_uri is a DAV path, is_dav=1; otherwise is_dav=0.

Perform VAD Package Installation by calling DB.DBA.VAD_INSTALL.

If package_uri is a DAV path, is_dav=1; otherwise is_dav=0.

The administrator performs the following operations when installing:

The return value of the DB.DBA.VAD_INSTALL() function is usually a sum of messages from pre- and post-installation procedures of the package. It should normally contain at least the following:

VAD packages should be tested by installing them on an empty Virtuoso database, after any required VAD packages. Installing a package on an empty server is useful for determining that no procedures or components were missed. Since the application would normally run on the development machine where the VAD package was built, it is easy to overlook some components. The completeness of the source archive of the application and its independence from any ad hoc SQL objects is important; this is the only way the package can be reliably versioned, tracked or uninstalled.

Check if a VAD package may be uninstalled by calling DB.DBA.VAD_CHECK_UNINSTALLABILITY

This performs a preliminary read-only check to see whether the given package can be uninstalled. It does not execute any pre-uninstall Virtuoso/PL code from within the package, hence its success does not guarantee that uninstallation will succeed.

Perform VAD Package Uninstallation by calling DB.DBA.VAD_UNINSTALL.

The administrator will perform the following operations for the uninstallation process:

Check the state of VAD package installation by calling DB.DBA.VAD_CHECK.

This checks to see if the elements of the package are as they are defined in the original distribution. A list of differing elements is returned. Differences revealed may not indicate a corruption; such changes could have been made intentionally by another package, possibly a later version or upgrade that added some columns to tables, and some resources may have been customized by the user post-installation.

This also checks for the prior existence of tables, views etc. owned by other applications that are not compatible with this application. Any such schema objects found are listed and the installation will not continue. The DBA may drop them to allow the installation to succeed. Some such elements may not belong to any package, in which case no package uninstall is available and the DBA must drop them with the appropriate SQL commands.


6.1.4.10. Package Overlap

Each package contains full definitions of all tables and indices. Upon installation, the following outcomes can occur:

Thus the same SQL schema can be loaded twice without ill effect.

The post install script should be used to populate tables and such. Inserts should be executed using the insert soft statement so that attempts to insert duplicates are silently ignored without causing the installation to fail. The post install script can perform any application level data format changes.
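For example, a hypothetical post-install fragment populating a package-private table, where a repeated installation silently skips rows that already exist (the table and column names are illustrative only):

insert soft MY_PKG.DBA.SETTINGS (S_KEY, S_VALUE)
  values ('schema_version', '1.0');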

Packages should define their own distinct catalog or qualifier. They should not overwrite another package unless upgrading a prior version. Sometimes a package will require the use of another package's tables. This should be achieved via grants issued in a pre-install script. A schema element such as a table, view or procedure will always have at most one owner package even though it may be referenced or even modified with additional columns by another package installed later. These elements will only be dropped when the owner package is dropped. Tables created ad-hoc from interactive SQL do not have any owner package.


6.1.4.11. VAD Sticker

The VAD Sticker contains meta-data and descriptions of resources contained, or to be contained, within a VAD package. Like any XML document, the target VAD package sticker can be sourced from more than one source file, which can aid maintenance and development.

6.1.4.11.1. VAD Sticker DTD

The namespace vad, used below, represents the URI http://www.openlinksw.com/urn/vad.

The top level element of a VAD Sticker is <sticker>. It must contain a <caption> element and may contain <dependencies>, <procedures>, <ddls> and <resources> elements.

<!--<<top>>-->
<!ENTITY % vad.source_sticker "INCLUDE">
<!ENTITY % vad.package_sticker "IGNORE">
<!ENTITY % vad.ecm.group_content "(dependencies | procedures | ddls | resources | registry)" >
<![%vad.source_sticker;[
  <!ENTITY % vad.ecm.sticker "(caption, (group | %vad.ecm.group_content;)*)">
  <!ELEMENT group ((group | %vad.ecm.group_content;)*) >
  ]]>
<![%vad.package_sticker;[
  <!ENTITY % vad.ecm.sticker "(caption, %vad.ecm.group_content;)">
  ]]>
<!ELEMENT sticker %vad.ecm.sticker; >
<!ATTLIST sticker
  version     NMTOKEN #REQUIRED
  xml:lang    CDATA   #REQUIRED
  >
    <!--<</top>>-->

    <!--<<caption>>-->
<!ELEMENT caption (name, version)>
<!ELEMENT name ((prop)*)>
<!ATTLIST name
  package NMTOKEN #REQUIRED
  >
<!ELEMENT version ((prop)*)>
<!ATTLIST version
  package NMTOKEN #REQUIRED
  >
<!ELEMENT prop EMPTY>
<!ATTLIST prop
  name NMTOKEN #REQUIRED
  value CDATA #REQUIRED
  >
    <!--<</caption>>-->

The caption contains one name and one version element. These elements have a package attribute for keeping requisites used by VAD procedures. Other prop elements are for keeping admin-readable info, but they will not affect the installer's behavior. Typical names of properties here are Vendor, Copyright, Release+Date, Build, Language and Encoding, but any (even non-unique) names are acceptable.

Sticker's elements for dependencies

<!--<<dependencies>>-->
<!ELEMENT dependencies ((require | allow | conflict)*) >
<!ATTLIST dependencies>
<!ENTITY % vad.ecm.version_list "((version | versions_earlier | versions_later)*)">
<!ELEMENT require (name, %vad.ecm.version_list;) >
<!ELEMENT allow (name, %vad.ecm.version_list;) >
<!ELEMENT conflict (name, %vad.ecm.version_list;) >
<!ATTLIST require
  group NMTOKEN #IMPLIED
  >
<!ELEMENT versions_earlier ((prop)*)>
<!ATTLIST versions_earlier
  package NMTOKEN #REQUIRED
  >
<!ELEMENT versions_later ((prop)*)>
<!ATTLIST versions_later
  package NMTOKEN #REQUIRED
  >
    <!--<</dependencies>>-->

The dependencies element contains a list of packages related to the given one. For every version or range of versions of every package, the developer may specify whether the given version is required, allowed but not required, or known to cause conflicts.

More precisely, to find information about a particular version of a package, the children of the dependencies element are scanned from top to bottom. If the first matching record is in a conflict group, not in require or allow, then installation is impossible. On the other hand, there must be at least one installed package for every require section.

The require element may be labeled with an optional group attribute. As an exception to the common rule, there must be at least one installed package for every group of require sections with an identical name. For example, if the installation of package B requires either of two interchangeable packages A1 and A2, the sticker should contain a pair of nodes in the same group:

<require group="G">
  <name package="A1">...</name>
</require>

...

<require group="G">
  <name package="A2">...</name>
</require>
Note:

There are no methods to specify that exactly one package, either A1 or A2, should be installed. This must be done by placing proper conflict descriptions in the stickers of A1 and/or A2, not in the sticker of B.

Sticker's elements for procedures

<!--<<procedures>>-->
<!ELEMENT procedures ((sql)*)>
<!ATTLIST procedures
  uninstallation (supported | prohibited) #REQUIRED
  >
<![%vad.source_sticker;[
  <!ENTITY % vad.sql.include "include CDATA #IMPLIED">
  ]]>
<![%vad.package_sticker;[
  <!ENTITY % vad.sql.include "">
  ]]>
<!ELEMENT sql (#PCDATA)>
<!ATTLIST sql
  purpose (install-check | pre-install | post-install | uninstall-check | pre-uninstall | post-uninstall) #REQUIRED
  %vad.sql.include;
  >
    <!--<</procedures>>-->

The procedures element contains a list of Virtuoso/PL fragments, and every fragment is tagged with one of the six values of the purpose attribute. At every stage of the install or uninstall procedure, the whole list is scanned from beginning to end, and all procedures of the appropriate sort are executed in the order in which they are listed. In source sticker files, the include attribute may be used to insert the text of an external file instead of writing the SQL code inside the element.

Sticker's elements for ddls

<!--<<ddls>>-->
<!ELEMENT ddls ((sql)*)>
<!ATTLIST ddls
  >
    <!--<</ddls>>-->

The ddls element is very similar to procedures and contains a list of Virtuoso/PL fragments that create schemas etc.

Sticker's elements for resources

<!--<<resources>>-->
<!ELEMENT resources ((file | location)*)>
<!ATTLIST resources >
<![%vad.source_sticker;[
  <!ENTITY % vad.file.source_uri "source_uri CDATA #IMPLIED">
  ]]>
<![%vad.package_sticker;[
  <!ENTITY % vad.file.source_uri "">
  ]]>
<!ELEMENT file EMPTY>
<!ATTLIST file
  type (doc | http | dav | code | special) #REQUIRED
  source (http) "http"
  target_uri CDATA #REQUIRED
  makepath (yes | no | abort) "abort"
  overwrite (yes | no | abort | equal | expected) "equal"
  package_id CDATA #IMPLIED
  location IDREF #IMPLIED
  dav_owner CDATA #IMPLIED
  dav_grp CDATA #IMPLIED
  dav_perm CDATA #IMPLIED
  %vad.file.source_uri;
  >
<!ELEMENT location ((prop)*) >
<!ATTLIST location
  id ID #REQUIRED
  default_target_uri CDATA #REQUIRED
  >
    <!--<</resources>>-->

The resources element lists all files to be copied onto the target machine. For every file, source and target URIs should be specified, along with the suggested behavior for cases where a directory must be created or a file overwritten. The target URI may be relative to one of several roots: documentation, web resources, DAV, SQL code (where virtuoso.ini is located), or one of the special locations additionally specified by location elements. (The installer may ask the administrator to change the locations' roots; in that case, information from the location's properties is shown to the administrator.) By default, the value of package_id is a space-delimited list of the type, the location ID (if any) and the target URI.

VAD installable file descriptions

To install files into DAV:

<file type="dav" source="http" target_uri="yacutia/yacutia_style.css" dav_owner='dav' dav_grp='administrators' dav_perm='111101101N' makepath="yes"/>
<file type="dav" source="http" target_uri="yacutia/yacutia_vdir_style.css"  dav_owner='dav' dav_grp='administrators' dav_perm='111101101N' makepath="yes"/>

To install files into file system:

<file type="http" source="http" target_uri="yacutia/yacutia_style.css" makepath="yes"/>
<file type="http" source="http" target_uri="yacutia/yacutia_vdir_style.css" makepath="yes"/>

Sticker's elements for registry

<!--<<registry>>-->
      <!ELEMENT registry ((record)*)>
      <!ATTLIST registry >
      <!ELEMENT record ANY>
      <!ATTLIST record
        key CDATA #REQUIRED
        type (STRING | INTEGER | KEY | URL | XML) #REQUIRED
        overwrite (yes | no | abort | equal | expected) "equal"
        >
    <!--<</registry>>-->

The registry element lists all branches to be defined in the VAD Registry. Every record element contains the data of one record. The first child of the record element (either text or an element) is serialized and stored as the value of the DB.DBA.VAD_REGISTRY.R_VALUE cell. To prevent errors, it is recommended to keep comments about the data outside the record element: placed in the wrong position inside it, they may be stored in the registry instead of the intended data.

Sample Stickers

A package that contains only some commonly useful ("exported") functions, one table for internal purposes, a small sample VSP application, and a small set of documentation files.

<?xml version="1.0" encoding="ASCII" ?>
<!DOCTYPE sticker SYSTEM "vad_sticker.dtd">
<sticker version="1.0.010505A" xml:lang="en-UK">
 <!-- Name and version; common data about the package -->
 <caption>
  <name package="rdf_lib">
   <prop name="Title" value="RDF Support Library" />
   <prop name="Developer" value="OpenLink Software" />
   <prop name="Copyright" value="(C) 2003 OpenLink Software" />
   <prop name="Download" value="http://www.openlinksw.com/virtuoso/rdf_lib/download" />
   <prop name="Download" value="http://www.openlinksw.co.uk/virtuoso/rdf_lib/download" />
  </name>
  <version package="3.14">
   <prop name="Release+Date" value="2003-05-05" />
   <prop name="Build" value="Release, optimized" />
  </version>
 </caption>
 <!-- This package requires no other packages,
but it conflicts with package virtodp of versions
from 1.00 to 2.17, inclusive -->
 <dependencies>
  <allow>
   <name package="virtodp"></name>
   <versions_earlier package="1.00"></versions_earlier>
  </allow>
  <conflict>
   <name package="virtodp">
    <prop name="Title" value="Virtuoso ODP Sample" />
   </name>
   <versions_earlier package="2.17">
    <prop name="Date" value="2001-01-26" />
    <prop name="Comment"
	  value="An incompatible version of RDF library is included in some old versions of virtodp " />
   </versions_earlier>
  </conflict>
 </dependencies>
 <!-- There are no installation procedures, other than DDLs -->
 <procedures uninstallation="supported"></procedures>
 <!-- There are some procedures, which may be re-applying and (maybe) reverted automatically -->
 <ddls>
  <sql purpose="pre-install">
   "DB"."DBA"."VAD_CREATE_TABLE" ('DB', 'DBA', 'RDF_SCHEDULED_IMPORTS',
      'ID integer,
	   URI varchar,
	   CALLBACK varchar,
	   VERSION varchar,
	   REPORT long varchar,
	   primary key (ID)');
  </sql>
  <sql purpose="post-install">
   "DB"."DBA"."VAD_LOAD_RESOURCE" ('rdf_functions');
  </sql>
 </ddls>
 <!-- Resources include... -->
 <resources>
  <!-- ...documentation, ... -->
  <file type="doc" target_uri="rdf_lib/1.1/intro.dxt" />
  <file type="doc" target_uri="rdf_lib/1.1/interface.dxt" />
  <file type="doc" target_uri="rdf_lib/1.1/implementation.dxt" />
  <file type="doc" target_uri="rdf_lib/1.1/sample_app.dxt" />
  <!-- ...the file of commonly-useful functions, ... -->
  <file package_id="rdf_functions"
    type="code" target_uri="rdf_lib/1.1/rdf_lib.sql" />
  <!-- ...pages of the sample application, named rdf_edit, ... -->
  <file type="http" target_uri="rdf_lib/rdf_edit/default.htm" />
  <file type="http" target_uri="rdf_lib/rdf_edit/browse.vsp" />
  <file type="http" target_uri="rdf_lib/rdf_edit/edit.vsp" />
  <!-- ...a DAV resource with sample RDF data, ... -->
  <file type="dav" target_uri="rdf_lib/sample_odp_structure.rdf" />
  <!-- ...two files of sample application's functions. -->
  <file type="code" target_uri="rdf_lib/1.1/rdf_edit/content_level.sql" />
  <file type="code" target_uri="rdf_lib/1.1/rdf_edit/view_level.sql" />
 </resources>
 <!-- There are no application-specific registry items in this package -->
</sticker>



6.1.5. Data Backup & Recovery

Administering a database involves taking backups and having a readiness to recover from backups and subsequent transaction logs.

Backups can be taken in two principal ways:

The Virtuoso backup functions can be used from any client directly, such as ISQL. It is possible, and perhaps preferable, to create stored procedures for performing the backup functions and scheduling these with the Virtuoso scheduler.

The actual database file(s) can be copied while the database is running so long as no checkpoint is made during the copy process. Checkpointing can be disabled for this, but make sure it is re-enabled after the backup has been made.

Making a full backup of a large database can take several hours, if not days, simply due to the speed of tapes or local area networks. Full backups must in all cases be done without an intervening checkpoint. This is why frequent full backups are not desirable. To ensure the possibility of full recovery one must have the complete set of transaction logs (audit trail log) since the last backup.

Restarting the database after restoring backed up files will show the state in effect at the last checkpoint preceding the backup. Any transaction log files made after the point of backup can be replayed to bring the state up to the last recorded transaction.

6.1.5.1. Log Audit Trail

Virtuoso can maintain a transaction audit trail. This is enabled using the CheckpointAuditTrail setting in the virtuoso.ini file. When this setting is non-zero, Virtuoso will begin a new transaction log after each checkpoint. Thus one automatically gets a full, unbroken sequence of transaction logs for the entire age of the database. These logs are named as specified in virtuoso.ini and are suffixed with their creation time.

Transaction logs older than the log that was current at the time of the last backup are superfluous for recovery, since their transactions were checkpointed before the backup started. Transactions of the log current at the time of the backup are NOT in the backed up state since they were not checkpointed, i.e. written into the read-only section of the database containing the data being backed up.

We strongly advise having the CheckpointAuditTrail enabled in any production environment.

It is good practice to keep at least two generations of full backup, since the last backup may contain errors that were not known at the time of its making. If such a precaution is taken, then only transaction logs older than the oldest backup are safe to remove. If we needed to recover from the oldest backup for any reason, we would require all audit transaction logs created during and after that backup.

A Virtuoso database can be restored from the last full backup and all Audit Trail transaction files created during and after the backup. Start the database as normal with the backup version of the database file. Once the database has started, connect using isql. You can then use the replay() function to replay the transaction files up to the required point. It is vital that these files are replayed in the correct order.
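For example, a sketch of such a replay session, with hypothetical log file names suffixed by their creation times:

SQL> replay ('virtuoso.trx.20011010120000');
SQL> replay ('virtuoso.trx.20011011120000');
SQL> checkpoint;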


6.1.5.2. On-Line Backups

6.1.5.2.1. Backup Using Backup_Online()

Virtuoso is capable of performing online backups, so normal database operation does not have to be disrupted in order to take backups. The backup_online() function can be used to back up the database, in the state effective at the last checkpoint, to a series of backup files.

The database storage is divided into a checkpoint space, which is a read-only image from the time of the last checkpoint and thus can be safely backed up at any time between checkpoints, and a commit space, where updates subsequent to the last checkpoint are stored. Additionally, every time a new checkpoint is made, the database records which pages have changed since the previous one. This change tracking makes incremental backups possible.

The first time the backup_online function is called, it saves a compressed copy of the then current checkpoint state into one or more files. The next time it is called, it writes only the changes that have come into the checkpoint space since the previous call. The change tracking data can be erased with the backup_context_clear function; the next call to backup_online will then make a full backup. Files generated by one or more calls to backup_online without an intervening backup_context_clear form a series with distinct serial numbers and must be restored together.

In order to restore such files, the administrator must delete the previous database files and start the server with a special flag indicating the location of the backup files. This brings the database to the state of the checkpoint immediately preceding the last call to backup_online, i.e. the one that wrote the newest of the backup files being restored. To restore onwards from this state, the administrator must replay transaction logs, starting with the log that was current when the last call to backup_online was made. In order to preserve all such logs, one must run with the CheckpointAuditTrail ini parameter set to 1.

A database checkpoint cannot be performed while an online backup is in progress.

Performing an Online Backup
SQL> backup_context_clear ();
SQL> checkpoint;
SQL> backup_online ('virt-inc_dump_#', 150);

The backup_online() procedure differs from the CheckpointAuditTrail mechanism mainly in that it can be started from any point in the database's life. Unless CheckpointAuditTrail was enabled when the database was created, restoring would require the database file at a particular state plus all transaction logs created since that state. Only the backup set files are required to restore from backup_online(). backup_online() also makes a compressed backup, making it far more suitable for large databases.

The last optional parameter allows specifying the directory or directories where the backup files are to be stored. See the backup_online() description for details.
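For example, a sketch that assumes backup_online() accepts a vector of target directories as its last argument, per the backup_online() reference (check that entry for the exact signature before relying on this):

SQL> backup_online ('virt-inc_dump_#', 150, 0, vector ('/backup1/', '/backup2/'));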


6.1.5.2.2. Restoring From an Online Backup Series

To restore from a backup series the administrator must first shut down the Virtuoso database server and move all database files (e.g. virtuoso.db and virtuoso.trx) out of the database directory. It is recommended that copies be taken rather than deleting them entirely. Then the command:

<virtuoso exe> +restore-backup <FILE_PREFIX>
-- for example:
virtuoso-iodbc-t +restore-backup dump-20011010_#

must be issued in the directory where the "*.bp" files were stored. The database will then be restored. The expression <virtuoso exe> above must be replaced with the path and filename of the Virtuoso server executable used on your system (e.g. ..\virtuoso-odbc-t.exe).

Each file in the series has a header containing a unique identifier for the backup set and the sequence number of the file within the set. If an identifier in any file in the backup sequence differs from the identifier contained in the first file, the restoration process will stop and report an error, which is written to the Virtuoso log file.

At times the backup or restoration commands may return errors. Use the following list to help diagnose and resolve them:

Restoring an Online Backup

Following the online backup example above:

SQL> backup_context_clear ();
SQL> checkpoint;
SQL> backup_online ('virt-inc_dump_#', 150);

The following command could be used to restore the database from the backup files created:

		virtuoso-iodbc-t -c <db-ini-file> +restore-backup virt-inc_dump_#
	

or:

		virtuoso-odbc-t.exe -c <db-ini-file> +restore-backup virt-inc_dump_#
	


6.1.5.3. Other Backup Methods

A possible way of making a full backup of a large database is first to turn off any automatic checkpoints and then make a compressed copy of the files. After the backup is complete, checkpointing should be re-enabled. The files should be compressed to make efficient use of space, and should be copied to a disk separate from the location of the database, preferably to an external backup medium such as tape.

6.1.5.3.1. Manual Backup

For a large database it is best to turn off any automatic checkpoints and copy the database files to external storage. Checkpoints should be turned off by issuing the command:

checkpoint_interval (-1);

at the SQL prompt. Checkpoints can be re-enabled in a post-backup script by:

checkpoint_interval (<n>);

which sets the automatic checkpoint interval to <n> minutes. The backup will be unusable if a checkpoint is made while it is in progress. Thus it is important to guarantee that checkpoints do not occur. The only safe way of doing this is the above, since it is in principle possible to have a server crash during the backup and a roll forward following restart, all while the backup is in progress. If this happens, the backup will be readable and consistent with the state of the last checkpoint if and only if there are no checkpoints between its start and completion. Setting the interval to -1 guarantees that the server, when starting after recovery, will not make a checkpoint.

The dba must make sure that clients do not issue checkpoint or shutdown statements while a backup is in progress.

The presence or absence of checkpoints at a given point in time can be ascertained from the virtuoso.log event log file.
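A minimal sketch of such a manual backup, assuming the dba password is dba, the server is at the default port 1111, a single database file virtuoso.db, and a hypothetical backup destination /backup:

$ isql 1111 dba dba EXEC="checkpoint; checkpoint_interval (-1);"
$ cp virtuoso.db /backup/virtuoso.db
$ isql 1111 dba dba EXEC="checkpoint_interval (60);"

The initial checkpoint ensures the copied file reflects the latest committed state before checkpoints are suspended for the duration of the copy.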



6.1.5.4. Off-Line Backups

When Virtuoso is not running, a complete and clean backup can be taken by making a copy of the database file and the transaction file(s) created after the last checkpoint.

To get an up-to-the-minute copy of a running database, one can copy the database file and the associated log, i.e. the file specified by TransactionFile in the database's configuration file. When the server is started, the log will roll forward and restore the database to the state following the last logged transaction.


6.1.5.5. Database Recovery

6.1.5.5.1. Rebuilding A Database

The process of rebuilding a database consists of dumping its contents into a large log file, or log files, and doing a roll forward from an empty database with that log.

The general steps to rebuild a database are:

Important:

It is recommended you take a backup copy of the database file(s) prior to this procedure.

It may sometimes be useful to rebuild a database as above to save space. Virtuoso does not relinquish space in the DB file back to the file system as records are removed; however, Virtuoso does reuse pages that are made available by the deletion of records. The steps above will build a new, compact database file. You would ordinarily not have to worry about this.


6.1.5.5.2. Diagnosing and Recovering a Damaged Database File

It is possible to recover data from a damaged Virtuoso database by a procedure similar to rebuilding a database as described above. A database file may be corrupt if the database repeatedly crashes during a specific operation.

To determine whether a database is corrupt, you may use the backup to a null file command in isql. For Unix platforms:

SQL> backup '/dev/null';

For windows platforms you can use:

SQL> backup 'NUL';

This command will read through the database checking its integrity. If the server crashes before completing the backup process, then the database is indeed corrupt and needs to be recovered. No other activity should take place while the command is executing.

To recover the database, follow the procedure for rebuilding it, except use the -D (capital D) or +crash-dump switch instead of -b. This will construct a log file which you can replay to make a new database. The database will contain the transactions that were committed as of the last successful checkpoint. If the database altogether fails to open, it may be the case that the schema is damaged.

It is possible that the database to be recovered is too large to fit in a single log file. The crash dump feature therefore allows segmenting the recovery log into a number of files. See the virtuoso.ini configuration file documentation for details. It is possible to make a crash dump in several pieces if there is not enough total disk space to hold the dump on the system where the database is running.

If the recovery log is split over several files it is necessary to set the transaction file in the ini to point to the first of these files, delete the database file(s) and start the server with the +restore-crash-dump option. When the server comes online, one can connect to it with isql and use the replay () function for replaying the remaining logs, one by one, in their original order.

For example, assuming the virtuoso.ini fragment:

Log1 = rec-1.log 100M
Log2 = rec-2.log 100M

we would make the dump with

virtuoso +crash-dump

and once the server has been started with +restore-crash-dump, with the ini setting TransactionFile set to rec-1.log, replay the remaining log with the isql commands:


SQL> replay ('rec-2.log');
Done.
SQL> checkpoint;
Note:

If the recovery is interrupted, it can be restarted at the last checkpoint made during the recovery. Note that a mid-recovery checkpoint may take a very long time, e.g. 1 hour for a 10GB database, since it is possible that the delta since the previous recovery checkpoint comprises almost the entire database.


6.1.5.5.3. Crash Recovery When The Normal Crash Recovery Fails

When the schema tables (e.g. DB.DBA.SYS_COLS, DB.DBA.SYS_KEYS) have corrupt rows the normal crash dump/crash restore procedure will not be possible because the server relies on the schema tables to know the key layouts for reading the data rows of other tables upon crash dump.

In such situations there is a special procedure to be followed to save as much data as possible from the corrupt database. The general steps are:

Thus the transaction log produced from the corrupt database, when replayed on the new database file (the one holding the schema tables data), makes the closest approximation to the corrupt database's data. However, this will not produce a workable database by itself - it may refuse inserts into tables with IDENTITY columns and will lose all the data within the Virtuoso registry (accessible via the registry_get()/registry_set() functions).

Because of the very nature of the crash-restore process described here and because of the fact that data is lost in the database schema, the server will not attempt to dump tables whose schema description is lost. So care should be taken when reading the data from the database.

This restoration procedure in no way replaces the regular database backup procedures, it merely tries to save whatever reasonable data there may be left from the database file.

The recovery sequence is as follows:

  1. Do a crash dump of the schema tables (using the '+crash-dump +mode oa +dumpkeys schema' virtuoso command line options).

  2. Create a new INI file to describe the layout of the new database you'll use to temporarily fill up the restored data.

  3. Move the transaction log file(s) produced in step 1 to the location required by the new INI file.

  4. Replay the transaction log from step 1 on an empty database using the new INI file. You will now have the schema tables readable in the new database (and nothing else):

    Virtuoso options : -c <your new ini file> +restore-crash-dump -f
    
  5. Make a crashdump of the data in the non-schema tables of the old database while having read the schema tables from the new database:

        -c <your new ini file> +crash-dump +crash-dump-data-ini <your old ini file> +mode o -f
    
  6. Move the transaction log file(s) produced in step 5 to the location required by the new INI file.

  7. Replay the transaction log from the previous step into the new db file using:

        -c <your new ini file> +restore-crash-dump
    
  8. Do a normal crash dump of the new database:

        -c <your new ini file> +crash-dump
    
  9. Move away (backup) the original (old) database files and put the transaction log produced by the above step into the location specified in the original INI file. You can also delete the rest of the DB files of the new database at that point.

  10. Replay the transaction log to make the old database afresh.

A sample Unix script follows that automates the above procedure somewhat. This script expects the crashed database virtuoso.db.save and an appropriate INI file (no striping, no log segmentation, transaction log file name virtuoso.trx) in the current directory, and creates the restored database there. It also expects the virtuoso-iodbc-t executable to be in the operating system path. Also, make sure that you have a suitable virtuoso.lic license file in the current directory.

#!/bin/sh

rm -rf xmemdump.txt virtuoso.trx virtuoso.tdb virtuoso.log virtuoso.db virtuoso.lck core.* new.lck new.log new.trx new.tdb new.db new.ini
cp -f virtuoso.db.save virtuoso.db
cat virtuoso.ini | sed 's/virtuoso\./new./g' > new.ini

virtuoso-iodbc-t -f +crash-dump +mode oa +dumpkeys schema

ls -la *.trx
mv virtuoso.trx new.trx

virtuoso-iodbc-t -c new -f -R
virtuoso-iodbc-t -c new +crash-dump +crash-dump-data-ini virtuoso.ini +mode o -f

ls -la *.trx
mv virtuoso.trx new.trx

virtuoso-iodbc-t -c new -R -f

virtuoso-iodbc-t -c new +crash-dump -f

rm -f virtuoso.trx virtuoso.tdb virtuoso.log virtuoso.db virtuoso.lck
ls -la *.trx
mv new.trx virtuoso.trx

virtuoso-iodbc-t -R -f

6.1.5.5.4. Crash Recovery Across Virtuoso VDBMS Server Versions

If the database was created with a version prior to the one being used for rebuilding, the system tables may be different. Creation here refers to the first time the database was made; a crash recovery does not count as a fresh start.

If this is or may be the case, the first log must be rolled forward into the empty database BEFORE the new and possibly incompatible system tables are created. This is done by setting the TransactionFile parameter to the first of the recovery logs and starting the server with the -R or +restore-crash-dump switch. For good practice one should also specify the no checkpoint switch, so that the log will in no case be damaged after the initial step of the roll forward. After this initial step the system tables will be compatible and the dba can proceed to replay the remaining recovery logs with the replay function.




6.1.6. Performance diagnostics

This section provides a checklist for improving performance of a Virtuoso server.

If something does not work fast, this is mostly for the following reasons:

Determining which is the case is simple. The result set returned by status ('') has most of the information. Take the sample twice, with some 10 seconds between the samples, and examine the second result set.
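For example, at the isql prompt:

SQL> status ('');
-- wait some 10 seconds, then take the second sample:
SQL> status ('');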

6.1.6.1. Memory

If there is not enough memory, there will be frequent disk access. This is seen from the buffers and disk usage lines.

The very simplest test for this is looking at the CPU % of the process in top. If there is constant load and the percent is low then the server is IO bound.

If not all memory is in use, then memory cannot be the problem. This is seen from the buffers line. If the used number is under 80% of the total, or if the replace age is several times larger than the count of buffers, then things are OK. If the replace age is 0, then no buffers have ever been replaced and everything that was ever read is still in memory.

If the replace age is less than or close to the buffer count times 4, then cache replacement is frequent and adding buffers is advised.

Adding more than 60-70% of system RAM as buffers is not useful. The setting is NumberOfBuffers in the ini file; count about 9K per buffer.

The disk access is summarized on the disk usage line. First is the number of reads since startup, then the average latency in milliseconds. If %r is high, then a lot of the time between the previous status and this one was spent on disk. This can be over 100: one thread that waits for disk all the time counts for 100. If the percentage is high, then adding more disks and striping over them will be useful. Even with a single database file, adding file descriptors (the FDsPerFile setting in the ini) may help. If the average read latency is 0 or close to it, then the data is cached by the OS. If it is high, then adding disks and striping may reduce it.

The read ahead line will tell if there are sequential reads. These are faster than random ones and can efficiently use striped disks.

If the workload is random access, then a high number of read ahead means that one might not have the required indices, thereby causing full table scans. More on this in the query plans section below.


6.1.6.2. Swapping

Swapping is always bad. If swapping occurs, then one has too many buffers and should decrease their number. Use an OS tool like top to see the size of the database process and its virtual memory use. Having a resident size smaller than the logical size is not always bad, since some code or data in the process may simply be unused. However, having a resident size less than the amount of memory allocated for buffers after reaching a steady state (i.e. all buffers in use) is always bad. Before steady state, i.e. during cache warm-up, the resident size is normally less than the buffer pool's size.

To see the count of major page faults, i.e. ones that read the disk, do:

select getrusage ()[4];

through interactive SQL. The result is the count of major faults since starting the process. The count should not vary between samples, at least not more than a few units per minute.

This function is not available under Windows. Use the task manager instead for tracking this.

There can be an actual memory leak, especially with plugins or hosting. Watch the growth of the virtual size throughout a run that reaches full buffer utilization and does a couple of checkpoints. Past this point, the process should not grow. You may also check whether the ThreadCleanupInterval or ResourcesCleanupInterval ini parameters have an effect. If the process still grows in spite of all this, there may be a genuine leak. It is normal for the size to fluctuate somewhat due to varying amounts of uncommitted data, threads, connections and the like.


6.1.6.3. Locking

The lock status section indicates the count of deadlocks and waits. If the 2r1w number is high, it means that the application should read for update: it takes shared locks and cannot upgrade them to exclusive. Adding the for update option to selects in the right places will fix this; setting DefaultIsolation in the ini file to 2 (read committed) is also good. If the lock wait count increases fast, then locking may degrade performance. The threads line below shows how many threads have some task to do. The waiting number is the count of these threads that are waiting for a lock at the time of the status. The vdb number is the count of threads waiting for network I/O, either from a virtual database or any web protocol.
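As an illustration, assuming a hypothetical ORDERS table, a select whose row will later be updated can take the exclusive lock up front, and the default isolation can be relaxed in the ini file:

select O_TOTAL from ORDERS where O_ID = 10 for update;

[Parameters]
; 2 = read committed
DefaultIsolation = 2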

Take a few samples; if these show few threads waiting and the waits or deadlocks counts do not climb much, locking is not a problem. If these numbers are high, see the sys_l_stat view for details.

select sum (waits), sum (wait_msecs) from sys_l_stat;

for the totals. The first number is the count of times a thread waited for a lock; the second is the sum of the real time spent waiting.

select top 10 * from sys_l_stat order by waits desc;

shows the keys on which most waits take place. See also the deadlocks and the wait_msecs columns. Numbers are cumulative.

For details on disk, see:

select top 10 * from sys_d_stat order by reads desc;

These views are explained in more detail in the performance meters section.

If there is a multi-user application doing random access for read or write, it is a good idea to partition the data so that the clients do not all hit the same page all the time. For example, to allow parallel insert of orders without contention, the TPC C schema prefixes the order number with a warehouse and district number, so that simultaneous inserts will most often not hit the same page, as sketched below.
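A minimal sketch in the spirit of the TPC C NEW_ORDER table (column names hypothetical); because the warehouse and district lead the primary key, concurrent inserts from different districts mostly land on different pages:

create table NEW_ORDER (
  NO_W_ID integer,  -- warehouse
  NO_D_ID integer,  -- district
  NO_O_ID integer,  -- order number
  primary key (NO_W_ID, NO_D_ID, NO_O_ID));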

Even when there is no locking, there is still some serialization for database page access. This is shown in the various wait columns of SYS_L_STAT. If the sum of the waits for a key is over 1% of the touches for the key, as given in SYS_D_STAT, there is contention and a performance penalty of maybe 10%.

Such things can be improved by altering the schema design; configuration parameters do not usually help. If there is disk access, having more memory always helps, because then locks will be held for less time.


6.1.6.4. Query Plans

To know if there is a bad query plan, see the explain output of the query.

Unless intended, a full table scan is pretty much always bad.

In the explain () result set, a table access is marked by From followed by the table name. Below it are listed the driving index, the conditions on the index and the conditions that are not indexable.

Joins are listed one after the other, the outermost first. The join order can be read from the explain output going from top to bottom.

If a table is read on a non-primary key and columns not covered by the driving index are accessed, a second key is mentioned with a full match of the primary key columns as the condition.

A full table scan looks like this:

explain ('select count (*) from t1');

...
from DB.DBA.T1 by STR2 8.4e+04 rows
Key STR2 ASC ()

 Local Code
  0: $30 "count" := artm $30 "count" + <constant (1)>
...

There are no conditions mentioned.

A lookup with index looks like this:

explain ('select fi2 from t1 where row_no = 11');
...
from DB.DBA.T1 by T1 Unique
Key T1 ASC ($26 ".FI2")
 inlined <col=1694 ROW_NO = <constant (11)>>
...

The condition is shown on the line below the key.

A lookup with full scan testing every row looks like this:

...
from DB.DBA.T1 by T1 2.5e+04 rows
Key T1 ASC ($26 ".FI2")

row specs: <col=1699 FI2 > <constant (11)>>
...

The condition is shown after the heading row specs. The whole index mentioned on the Key line will be read and every entry tested. If both indexed and row tests exist, the indexed ones are applied first, as one would expect.

If your query has unintended full table scans, consider adding an index, for example as below.
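A minimal sketch, reusing the T1 and FI2 names from the examples above; with the index in place, the FI2 > 11 condition can drive an index lookup instead of a scan:

create index T1_FI2 on T1 (FI2);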

If the index choice is not the right one, consider the following possibilities:

Hash joins can make full table scans. This is OK if the table scanned is small.

A hash join looks like this:

explain ('select count (*) from t1 a, t1 b where a.row_no = b.row_no + 1');
...

{
Fork
{
Fork
{
from DB.DBA.T1 by STR2 8.4e+04 rows
Key STR2 ASC ($26 "B.ROW_NO")

 Local Code
  0: $30 "temp" := artm $26 "B.ROW_NO" + <constant (1)>
  4: BReturn 0

Current of: <$28 "<DB.DBA.T1 B>" spec 5>
Sort (HASH) ($30 "temp") -> ($26 "B.ROW_NO")

}
from DB.DBA.T1 by STR2 8.4e+04 rows
Key STR2 ASC ($37 "A.ROW_NO")


Current of: <$39 "<DB.DBA.T1 A>" spec 5>
Hash source ($37 "A.ROW_NO") -> ($26 "B.ROW_NO")

After code:
  0: $44 "count" := artm $44 "count" + <constant (1)>
  4: BReturn 0
}
Select ($44 "count", <$39 "<DB.DBA.T1 A>" spec 5>, <$28 "<DB.DBA.T1 B>" spec 5>)
}

First t1 is read from start to end and a hash table is filled with row_no + 1. Then t1 is read from start to end a second time. The hash source is the hash lookup. A hash join, when there is no index or when the whole table or a large part of it is traversed, is better than a loop join because it replaces random access with sequential access. The complexity is close to O(n + n) instead of O(n log n). Additionally, sequential reads make better use of read ahead.

The corresponding loop join looks like this:

explain ('select count (*) from t1 a, t1 b where a.row_no = b.row_no + 1 option (loop)');


{
Fork
{
from DB.DBA.T1 by STR2 8.4e+04 rows
Key STR2 ASC ($26 "B.ROW_NO")


Current of: <$28 "<DB.DBA.T1 B>" spec 5>

Precode:
  0: $29 "temp" := artm $26 "B.ROW_NO" + <constant (1)>
  4: BReturn 0
from DB.DBA.T1 by T1 Unique
Key T1 ASC ()
 inlined <col=1694 ROW_NO = $29 "temp">
 Local Code
  0: $35 "count" := artm $35 "count" + <constant (1)>
  4: BReturn 0

Current of: <$31 "<DB.DBA.T1 A>" spec 4>
}
Select ($35 "count", <$31 "<DB.DBA.T1 A>" spec 4>, <$28 "<DB.DBA.T1 B>" spec 5>)
}

To prevent hash joins, use table option or option at the end of the select, as seen above. A hash join is very bad if a whole table is read to fill a hash and then only a small number of entries are fetched from it. However, if there is no index, even this is better than a loop join.
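As a sketch of the two forms, on the query from above (see the SQL reference for the exact placement rules of table option):

-- statement-wide option:
select count (*) from t1 a, t1 b where a.row_no = b.row_no + 1 option (loop);
-- per-table option:
select count (*) from t1 a table option (loop), t1 b where a.row_no = b.row_no + 1;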


6.1.6.5. Checkpoint Duration

A checkpoint can take a very long time under certain special conditions. Normally a checkpoint takes about a minute when flushing several GB worth of buffers to disk. The flushing is mostly done online, after which there is an atomic time of a few seconds. After this, operation resumes. Applications do not notice the checkpoint except as a temporary increase in response delays.

If spikes in response time are to be avoided, then making frequent checkpoints is better than making infrequent ones.
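For example, a checkpoint can be run on demand and the automatic checkpoint interval set from SQL; checkpoint_interval takes minutes, and 20 here is just an illustrative value:

SQL> checkpoint;
SQL> checkpoint_interval (20);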

If there are long running transactions with many locks or uncommitted changes, then the checkpoint interval should be several times longer than the expected duration of such a transaction. If it is not, the checkpoint can fall in the middle of the transaction and will internally have to roll back and then re-establish the uncommitted state so as to write a clean image at the new checkpoint. Doing this many times in the life of a transaction is very inefficient. In general, do not make long transactions with locking. Prefer read committed for anything long.

A checkpoint's atomic time can be prohibitively long under the following circumstances:



6.1.7. Performance Tuning

6.1.7.1. I/O

6.1.7.1.1. Optimizing Disk I/O

Virtuoso allows splitting a database over several files that may be on different devices. By allocating database fragments onto independent disks I/O performance in both random and sequential database operations can be greatly enhanced.

The basic unit of a database is the segment. A segment consists of an integer number of 8K pages. A segment may consist of one or more files called stripes. If a segment has multiple stripes these will be of identical length and the segment size will be an integer multiple of the stripe size.

The size limit on individual database files is platform dependent, but 64 bit file offsets are used where available. For large databases use of multiple disks and segments is recommended for reasons of parallelism even though a single database file can get very large. A database can in principle grow up to 32TB (32-bit page number with 8KB per page).

When a segment is striped each logically consecutive page resides in a different file, thus for a segment of 2 stripes the first stripe will contain all even numbered pages and the second all the odd numbered pages. The stripes of a segment should always be located on independent disks.

In serving multiple clients that do random access to tables the server can perform multiple disk operations in parallel, taking advantage of the independent devices. Striping guarantees a statistically uniform access frequency to all devices holding stripes of a segment.

The random access advantages of striping are available without any specific configuring besides that of the stripes themselves.


6.1.7.1.2. Configuring I/O queues

Striping is also useful for a single client doing long sequential read operations. The server can detect the serial nature of an operation, for example a count of all rows in a table and can intelligently prefetch rows.

If the table is located in a striped segment then the server will read all relevant disks in parallel if these disks are allocated to different I/O queues.

All stripes of different segments on one device should form an I/O queue. The idea is that the database stripes that benefit from being sequentially read form a separate queue. All queues are then read and written independently, each on a different thread. This means that a thread will be allocated per queue. If no queues are declared all database files, even if located on different disks share one queue.

A queue is declared in the striping section by specifying an I/O queue identifier after the path of the file, separated by an equal sign.

[Striping]
Segment1 = 200M, disk1/db-seg1-1.db = iq1, disk2/db-seg1-2.db = iq2
Segment2 = 200M, disk1/db-seg2-1.db = iq1, disk2/db-seg2-2.db = iq2

In the above example the first stripes of the segments form one queue and the second stripes form another. This makes sense because now all database files on disk1 are in iq1 and all on disk2 are in iq2.

This configuration could have resulted from originally planning a 200 MB database split on 2 disks and subsequently expanding that by another 200 MB.

The I/O queue identifier can be an arbitrary string. As many background I/O threads will be made as there are distinct I/O queues.

Striping and using I/O queues can multiply sequential read rates by a factor almost equal to the number of independent disks. On the other hand, assigning stripes of one disk to different queues can have a very detrimental effect. The rule is that everything that is physically accessed sequentially should be on the same queue.



6.1.7.2. Schema Design Considerations

6.1.7.2.1. Data Organization

One should keep the following in mind when designing a schema for maximum efficiency.


6.1.7.2.2. Index Usage

A select from a table using a non-primary key will need to retrieve the main row if there are search criteria on columns appearing on the main row or output columns that have to be fetched from the main row. Operations are noticeably faster if they can be completed without fetching the main row if the driving key is a non-primary key. This is the case when search criteria and output columns are on the secondary key parts or primary key parts. Note that all secondary keys contain a copy of the primary key. For this purpose it may be useful to add trailing key parts to a secondary key. Indeed, a secondary key can hold all the columns of a row as trailing key parts. This slows insert and update but makes reference to the main row unnecessary when selecting using the secondary key.
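As an illustration, assuming a hypothetical ORDERS table, a secondary key on O_C_ID with O_DATE as a trailing key part answers the select below without touching the main row, since a secondary key also carries a copy of the primary key O_ID:

create table ORDERS (
  O_ID integer primary key,
  O_C_ID integer,
  O_DATE date,
  O_TOTAL numeric);

create index ORDERS_C_DATE on ORDERS (O_C_ID, O_DATE);

-- served from the secondary key alone
select O_DATE from ORDERS where O_C_ID = 11;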

A sequential read on the primary key is always fastest. A sequential search with few hits can be faster on a secondary key if the criterion can be evaluated without reference to the main row. This is because a short key can have more entries per page.


6.1.7.2.3. Space Consumption

Each column takes the space 'naturally' required by its value. No field lengths are preallocated. Space consumption for columns is the following:

Table: 6.1.7.2.3.1. Data type Space Consumption

Data                Bytes
Integer below 128   1
Smallint            2
Long                4
Float               4
Timestamp           10
Double              8
String              2 + characters
NULL                the data length for a fixed length column; a value of 0 length for a variable length column
BLOB                88 on row + n x 8K (see note below)

If a BLOB fits in the remaining free bytes on a row after non-LOBs are stored, it is stored inline and consumes only 3 bytes + BLOB length.

Each index entry has an overhead of 4 bytes. This applies to the primary key as well as any other keys. The length of the concatenation of the key parts is added to this. For the primary key the length of all columns are summed. For any other key the lengths of the key parts plus any primary key parts not on the secondary key are summed. The maximum length of a row is 4076 bytes.
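For example, under these rules a primary key row consisting of an integer of 1000 (4 bytes) and a 10-character string (2 + 10 bytes) takes 4 + 4 + 12 = 20 bytes.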

In light of these points primary keys should generally be short.


6.1.7.2.4. Page Allocation

For data inserted in random order, pages tend to be 3/4 full. For data inserted in ascending order, pages will be about 90% full, because the split point is chosen differently when there is a history of ascending inserts.



6.1.7.3. Efficient Use of SQL - SQL Execution profiling

Virtuoso offers an execution profiling mechanism that keeps track of the relative time consumption and response times of different SQL statements.

Profiling can be turned on or off with the prof_enable function. When profiling is on, the real time between the start and end of each SQL statement execute call is logged on the server. When prof_enable is called for the second time the statistics gathered between the last call to prof_enable and this call are dumped to the virtprof.out file in the server's working directory.

Profiling is off by default. Profiling can be turned on with the statement:

prof_enable (1);

The virtprof.out file will be generated when prof_enable is called for the second time, e.g.

prof_enable (0);

will write the file and turn profiling back off.

Below is a sample profile output file:

Query Profile (msec)
Real 183685, client wait 2099294, avg conc 11.428772 n_execs 26148 avg exec  80

99 % under 1 s
99 % under 2 s
99 % under 5 s
100 % under 10 s
100 % under 30 s

23 stmts compiled 26 msec, 99 % prepared reused.

 %  total n-times n-errors
49 % 1041791  7952     53     new_order
28 % 602789   8374     490   delivery_1
12 % 259833   8203     296   payment
5  % 123162   821      35    slevel
2  % 54182    785      0     ostat (?, ?, ?, ?)
0  % 11614    4        0     checkpoint
0  % 2790     2        1     select no_d_id, count (*) from new_orde
0  % 2457     3        1     select count (*) from new_order
0  % 662      2        0     status ()
0  % 11       1        1     set autocommit on
0  % 3        1        0     select * from district

This file was produced by profiling the TPC C sample application for 3 minutes. The numbers have the following meanings:

The real time is the real time interval of the measurement, that is the time between the prof_enable call that started the profiling and the call that wrote the report. The client wait time is the time cumulatively spent inside the execute call server side; only calls completely processed between profiling start and end are counted. The average concurrency is the execute time divided by the real time and indicates how many requests were on average concurrently pending during the measurement interval. The count of executes and their average duration are also shown.

The next section shows the percentage of executes that finished under 1, 2, 5, 10 and 30 seconds of real time.

The compilation section indicates how many statements were compiled during the interval. These will be SQLPrepare calls or SQLExecDirect calls where the text of the statement was not previously known to the server. The real time spent compiling the statements is added up. The percentage of prepared statement reuses, that is, executes not involving compiling over all executes is shown. This is an important performance metric, considering that it is always better to use prepared statements with parameters than statements with parameters as literals.

The next section shows individual statements executed during the measurement interval sorted by descending cumulative real time between start and completion of the execute call.

The table shows the percentage of real time spent during the calls to the statement as a percentage of the total real time spent in all execute calls. Note that these real times can be higher than the measurement interval since real times on all threads are added up.

The second column shows the total execute real time, the third column shows the count of executes of the statement during the measurement interval. The fourth column is the count of executes that resulted in an error. The error count can be used for instance to spot statements that often produce deadlocks.

Statements are distinguished for profiling purposes using the 40 first characters of their text. Two distinct statements that do not differ in their first 40 characters will be considered the same for profiling. If a statement is a procedure call then only the name of the procedure will be considered, not possibly differing literal parameters.

The profiler will automatically write the report after seeing 10000 distinct statements in the measurement interval. This puts a ceiling on the profiler's memory consumption for applications that always compile a new statement with different literals, which would otherwise produce an arbitrarily long list of statements each executed once. It is obvious that such a profile will not be very useful.

It cannot be overemphasized that if an application does any sort of repetitive processing then this should be done with prepared statements and parameters, for both reasons of performance and profilability.

Note:

Measurements are made with one millisecond precision. Percentages are rounded to 2 digits. Timing of fast operations, under a few milliseconds, will be imprecise as a result of the 1 ms resolution. The cumulative compilation time may also be substantially off, since a single compilation often takes less than 1 ms. The precision may also vary between platforms.


6.1.7.4. Meters & System Views

6.1.7.4.1. DB.DBA.SYS_K_STAT, DB.DBA.SYS_D_STAT, DB.DBA.SYS_L_STAT view

These views provide statistics on the database engine:

create view SYS_K_STAT as
  select KEY_TABLE, name_part (KEY_NAME, 2) as index_name,
	key_stat (KEY_TABLE, name_part (KEY_NAME, 2), 'n_landings') as landed,
	key_stat (KEY_TABLE, name_part (KEY_NAME, 2), 'total_last_page_hits') as consec,
	key_stat (KEY_TABLE, name_part (KEY_NAME, 2), 'page_end_inserts') as right_edge,
	key_stat (KEY_TABLE, name_part (KEY_NAME, 2), 'page_end_inserts') as lock_esc
	from SYS_KEYS;
create view SYS_L_STAT as
  select KEY_TABLE, name_part (KEY_NAME, 2) as index_name,
	key_stat (KEY_TABLE, name_part (KEY_NAME, 2), 'lock_set') as locks,
	key_stat (KEY_TABLE, name_part (KEY_NAME, 2), 'lock_waits') as waits,
	(key_stat (KEY_TABLE, name_part (KEY_NAME, 2), 'lock_waits') * 100)
	  / (key_stat (KEY_TABLE, name_part (KEY_NAME, 2), 'lock_set') + 1) as wait_pct,
	key_stat (KEY_TABLE, name_part (KEY_NAME, 2), 'deadlocks') as deadlocks,
	key_stat (KEY_TABLE, name_part (KEY_NAME, 2), 'lock_escalations') as lock_esc
	from SYS_KEYS;
create view sys_d_stat as
  select KEY_TABLE, name_part (KEY_NAME, 2) as index_name,
	key_stat (KEY_TABLE, name_part (KEY_NAME, 2), 'touches') as touches,
	key_stat (KEY_TABLE, name_part (KEY_NAME, 2), 'reads') as reads,
	(key_stat (KEY_TABLE, name_part (KEY_NAME, 2), 'reads') * 100)
	  / (key_stat (KEY_TABLE, name_part (KEY_NAME, 2), 'touches') + 1) as read_pct
	from SYS_KEYS;

These views offer detailed statistics on index access locality, lock contention and disk usage.

'reset' specified as the stat name will reset all counts for the key in question.
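For example, to zero the counters of a single key (table and index names here are placeholders; for a primary key the index name equals the table name):

select key_stat ('DB.DBA.ORDERS', 'ORDERS', 'reset');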


6.1.7.4.2. SYS_K_STAT - Key statistics

6.1.7.4.3. SYS_L_STAT

6.1.7.4.4. SYS_D_STAT
Examples:
select index_name, locks, waits, wait_pct, deadlocks
    from sys_l_stat order by 2 desc;

Get lock data, indices in descending order of lock count.

select index_name, touches, reads, read_pct
    from sys_d_stat order by 3 desc;

Get disk read counts, index with most reads first.

select index_name, (consec * 100) / (landed + 1)
    from sys_k_stat where landed > 1000  order by 2;

Get the percentage of consecutive page access on indices with over 1000 accesses so far, most randomly accessed first.


6.1.7.4.5. status SQL function - status ();

This function returns a summary of the database status as a result set. The result set has one varchar column, which has consecutive lines of text. The lines can be up to several hundred characters.

The contents of the status summary are described in the Administrator's Guide.


6.1.7.4.6. Virtuoso db file usage detailed info

All data in a Virtuoso database are logically stored as database key rows. Thus the primary key for a table holds the entire row (including the dependent part) and the secondary keys just hold their respective key parts. So the space that a table occupies is the sum of the space occupied by its primary key and all its secondary keys.

The main physical unit of allocation in a Virtuoso db file is the database page (about 8K in Virtuoso 3.x). So the server needs to map the key rows and outline blobs to database pages.

Virtuoso will store as many rows in a db page as it can, so usually one db page will contain more than one row of a given key. No page contains rows of more than one key. However, blobs (when not inlined on the row) are placed in consecutive db pages, as many as their size requires. In addition to the blob and key pages, the Virtuoso DB holds a number of pages containing internal data. So the sum of the pages occupied by key rows and blobs is less than the total count of occupied pages (as reported by the status () BIF).

To provide detailed information about the space consumption of each key there's a system view:

DB.DBA.SYS_INDEX_SPACE_STATS
    ISS_KEY_TABLE       varchar -- name of the table
    ISS_KEY_NAME        varchar -- name of the key
    ISS_KEY_ID          integer -- id of the key (corresponding to KEY_ID from DB.DBA.SYS_KEYS)
    ISS_NROWS           integer -- number of rows in the table
    ISS_ROW_BYTES       integer -- sum of the byte lengths of all the rows in the table
    ISS_BLOB_PAGES      integer -- sum of the blob pages occupied by the outline blobs on all the rows of the table
    ISS_ROW_PAGES       integer -- sum of all the db pages containing rows of this key
    ISS_PAGES           integer -- = ISS_BLOB_PAGES + ISS_ROW_PAGES (for convenience).

Each select on that view causes the server to go over all the db pages in the db file (similarly to how a crash dump operates) and collect the statistics above. The pages are traversed once per select, which can still take some time on large database files.
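A typical use is listing the largest keys by total page count:

select top 10 ISS_KEY_TABLE, ISS_KEY_NAME, ISS_PAGES
  from DB.DBA.SYS_INDEX_SPACE_STATS
  order by ISS_PAGES desc;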



6.1.7.5. Transaction Metrics, Diagnostics and Optimization

Bad design and implementation of transactions affects applications in the following ways:

The following rules should be observed when writing transactions:

6.1.7.5.1. Programming Virtuoso/PL

The isolation level is set in Virtuoso/PL with the

set isolation := level;

statement, where level is one of 'serializable', 'repeatable', 'committed', 'uncommitted'. Example:

set isolation = 'serializable';

The standard SQL syntax is also supported:

SET TRANSACTION ISOLATION LEVEL <isolation_level>
isolation level : READ UNCOMMITTED | READ COMMITTED | REPEATABLE READ | SERIALIZABLE

Example :

SET TRANSACTION ISOLATION LEVEL READ COMMITTED

The effect lasts for the rest of the procedure and any procedure called from it. The effect ends when the procedure that executed the set isolation statement returns.
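A minimal sketch (procedure name hypothetical):

create procedure LONG_REPORT ()
{
  set isolation = 'committed';  -- in effect until LONG_REPORT returns
  -- report queries follow here
  return;
}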


6.1.7.5.2. Sample Deadlock Handler

The text for a deadlock handler is:

  declare retry_count int;
  retry_count := 0;
retry:
  {
    declare exit handler for sqlstate '40001'
      {
        rollback work;
        ...
        -- if 2.5 seconds is the expected duration of the transaction
        delay (rnd (2.5));
        ...

        retry_count := retry_count + 1;
        if (retry_count > 5)
          signal ('xxxxx', 'Too many deadlock retries in xxxxx.');
        goto retry;
      }
    -- do the operations.  The actual working code here.

    commit work;
  }

An exclusive read is done with

select s_quantity from stock where s_i_id = 111 for update;

6.1.7.5.3. ODBC

For the Virtuoso ODBC driver the isolation is set by:


6.1.7.5.4. JDBC

In the Virtuoso JDBC driver the isolation is set by the java.sql.Connection.setTransactionIsolation() JDBC API.

  conn.setTransactionIsolation (java.sql.Connection.TRANSACTION_SERIALIZABLE);

The constants are described in the java.sql.Connection documentation.


6.1.7.5.5. .Net

In the VirtuosoClient.NET provider the isolation is set by the System.Data.IDbConnection.BeginTransaction (IsolationLevel) method.

  System.Data.IDbTransaction trx = conn.BeginTransaction (System.Data.IsolationLevel.ReadCommitted);

The constants are described in the System.Data.IsolationLevel documentation.



6.1.7.6. Metrics and Diagnostics

Metrics are presented at the server and the table level.

The first metric to check is the output of status ('');

The paragraph titled transaction status contains the following:

The system view db.dba.sys_l_stat is used for locating bottlenecks.

The columns are:

All counts and times are cumulative from server startup.

The interpretation is as follows:

If deadlocks is high in relation to waits or locks, i.e. over 5%, there are many deadlocks and the transaction profiles may have to be adjusted. The table where deadlocks is incremented is the table where the deadlock was detected, but the deadlock may involve any number of tables. So, if A and B are locked in the order A, B half of the time and B, A the rest of the time, then the deadlock counts of the tables of A and B will be about the same, half of the deadlocks being detected when locking A, the other half when locking B.

If waits is high in relation to locks, for example 20%, then there is probably needless contention: things are kept locked needlessly. Use read committed, make shorter transactions, or lock the items with less contention first.

Because transaction duration varies, the place with the highest count of waits is not necessarily the place with the heaviest contention if the waits are short. Use wait_msecs in addition to waits for determining where the waiting takes place.

To get a general picture, use the Conductor's Statistics page or simply do

select top 5 * from sys_l_stat order by wait_msecs desc;

to get a quick view of where time is spent. You can also sort by waits desc or locks desc.

6.1.7.6.1. SQL Issues

It is possible to get bad locking behavior if the SQL compiler decides to make linear scans of tables or indices and the isolation is greater than read committed. The presence of a linear scan on an index with locking is often indicated by having a large number of lock escalations. If lock_esc is near locks then a large part of the activity is likely sequential reads.

The general remedy is to do any long report type transactions as read committed unless there are necessary reasons to do otherwise.

To see how a specific query is compiled, one can use the explain () function. To change how a query is compiled, one can use the table option or option SQL clauses.


6.1.7.6.2. Observation of Dynamics

Deadlocks and contention do not occur uniformly across time. The occurrences will sharply increase after a certain application dependent load threshold is exceeded.

Also, deadlocks will occur in batches. Several transactions will first wait for each other and then retry, maybe several times, with perhaps only one succeeding at each round. In such worst cases there will be many more deadlocks than successfully completed transactions. Optimize the locking order and make the transactions smaller.

Looking at how the counts change, especially if they change in bursts, is useful.


6.1.7.6.3. Tracing and Debugging

The most detailed picture of a system's behavior, including deadlocks and other exceptions, can be obtained with profiling. If the application is in C, Java or some other compiled language, one can use the language's test coverage facility to see execution counts for the various branches of the code.

For client applications, using the Virtuoso trace() function can be useful for seeing which statements signal errors. Also the profiling report produced by prof_enable () can give useful hints on execution times and error frequencies. See Profiling and prof_enable ().

For PL applications, Virtuoso provides a profiler and test coverage facility. This is activated by setting PLDebug = 2 in the Parameters section of the ini file and starting the server. The functions cov_store () and cov_report () are used for obtaining a report of the code execution profile gathered thus far. See the documentation on "Branch Coverage" for this. The execution counts of the lines of exception handler code will show the frequency of exceptions. If in a linear block of code a specific line has an execution count lower than that of the line above it, then the line above has signalled as many exceptions as the difference between the two counts.

The times indicated in the flat and call graph profile reports for PL procedures are totals across all threads and include time spent waiting for locks. Thus a procedure that consumes almost no CPU can appear high on the list if it waits for locks, especially if this happens on multiple threads concurrently. The times are real times measured on each thread separately and can thus exceed any elapsed real time.

Single stepping is not generally useful for debugging locking since locking issues are timing sensitive.



6.1.7.7. Client Level Resource Accounting

Starting with version 6, Virtuoso keeps track of the count of basic database operations performed on behalf of each connected client. The resource use statistics are incremented as work on the connection proceeds. The db_activity () function can be called to return the accumulated operation counts and to optionally reset these.

The db_activity built-in function has one optional argument. The possible values are:

The human readable string is of the form:

   22.56MR rnd  1.102GR seq     10P disk  1.341GB /  102.7K messages
  

The postfixes K, M, G, T mean 10^3, 10^6, 10^9 and 10^12 respectively, except when applied to bytes, where they mean consecutive powers of 1024.

The numbers, left to right are the count of random row lookups, sequential row lookups, disk page reads, cluster inter-node traffic in bytes and cluster inter-node message count as an integer number of messages. If the configuration is a single server, the two last are 0.

The random and sequential lookup counts are incremented regardless of whether the row was found or not or whether it matched search conditions.
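A minimal use from interactive SQL; with no argument, db_activity returns the human readable string for the current connection (the optional argument, described above, selects the return form and whether the counters reset):

select db_activity ();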

If the information is retrieved as an array, the array contains integer numbers representing these plus some more metrics.

Index - Meaning

If the thread calling db_activity is a web server thread, the totals are automatically reset when beginning the processing of the current web request.