TigerGraph Docs : TigerGraph System Administrators Guide v1.1

Version 1.1

Copyright © 2015-2017 TigerGraph, Redwood City, California. All Rights Reserved.
For technical support on this topic, contact support@tigergraph.com with a subject line starting with "Sys Admin"



Hardware and Software Requirements

Hardware Requirements

Actual hardware requirements will vary based on your data size, workload and features you choose to install.

| Component | Minimum | Recommended |
|-----------|---------|-------------|
| CPU | 1.8 GHz (64-bit processor) or faster multi-core | Dual-socket multi-core, 2.0 GHz (64-bit processors) or faster |
| Memory* | 20 GB | ≥ 64 GB |
| Storage* | 300 GB | ≥ 1 TB, RAID10 volumes for better I/O throughput. SSD storage is recommended. |
| Network | 1 Gigabit Ethernet adapter | 10 Gigabit Ethernet adapter for inter-node communication |

*Actual needs depend on data size. Consult our solution architects for an estimate of memory and storage needs.

Comments:

  • The TigerGraph system is optimized to take advantage of multiple cores.
  • The TigerGraph graph data are stored in memory, so your machine's memory capacity should be large enough to store your graph.
  • The platform works excellently as a single node.  For high availability or scaling, a multi-node configuration is possible.

Certified Operating Systems

The TigerGraph Software Suite is built on 64-bit Linux and can run on a variety of 64-bit Linux distributions. The software has been tested on the operating systems listed below. When a range of versions is given, it has been tested on the two endpoints, oldest and newest. We continually evaluate the operating systems on the market and work to update our set of supported operating systems as needed. The TigerGraph installer will install its own copies of Java JDK and GCC, accessible only to the TigerGraph user account, to avoid interfering with any other applications on the same server.


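| Operating system | On-premises hosting | Java JDK version | GCC version (C/C++) |
|------------------|---------------------|------------------|---------------------|
| RedHat 6.8 (x64) | Yes | 1.8.0_141 | 4.8.2 |
| RedHat 7.2 (x64) | Yes | 1.8.0_141 | 4.8.2 |
| CentOS 6.5 to 6.9 (x64) | Yes | 1.8.0_141 | 4.8.2 |
| CentOS 7.0 to 7.3 (x64) | Yes | 1.8.0_141 | 4.8.2 |
| Ubuntu 14.04 LTS (x64) | Yes | 1.8.0_141 | 4.8.4 |
| Ubuntu 16.04 LTS (x64) | Yes | 1.8.0_141 | 4.8.4 |
| Debian 8 (jessie) | Yes | 1.8.0_141 | 4.8.4 |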

Additionally, we offer Amazon Machine Images (AMIs) to run on Amazon EC2. Please contact us regarding recommended configurations.


Prerequisite Software

Utilities

Before offline installation, the TigerGraph system needs a few basic software packages to be present:

  1. tar, to extract files from the offline package;

  2. crontab, a basic OS software module which TigerGraph relies on;

  3. ip, to configure the network;

  4. ssh/sshd, to connect to the server;

  5. more, a tool to display the License Agreement;

  6. netstat, a basic OS tool to check the network status.

If they are not present, contact your system administrator to have them installed on your target system. They can be installed with the following commands.

[CentOS or RedHat]: sudo yum install tar cronie iproute openssh-clients util-linux-ng net-tools
[Ubuntu or Debian]: sudo apt-get install tar cron iproute openssh-client util-linux net-tools
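
Before running the installer, it can help to see which of the utilities listed above are missing. The following is a minimal sketch only: the function name missing_utils is hypothetical and not part of TigerGraph, and it relies only on the standard `command -v` shell builtin.

```shell
#!/bin/sh
# Hypothetical helper (not part of TigerGraph): print each command name
# from the argument list that cannot be found on the current PATH.
missing_utils() {
    for util in "$@"; do
        command -v "$util" >/dev/null 2>&1 || printf '%s\n' "$util"
    done
}

# Usage, with the utilities listed above:
#   missing_utils tar crontab ip ssh sshd more netstat
```

An empty result means every listed utility was found. Note that sshd is often installed outside a regular user's PATH (for example in /usr/sbin), so a report that sshd is missing may need to be checked by hand.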

NTP

If you are running TigerGraph on a multi-node cluster, you must install, configure and run the NTP (Network Time Protocol) daemon service.

Browser

In an on-premises installation, the system is fully functional without a web browser. To run the optional browser-based TigerGraph GraphStudio User Interface, you need the Google Chrome browser.







Platform Installation - Single Node Configuration

Preparation

This guide assumes the following:

  1. You have a Linux server that meets the TigerGraph Hardware and Software Requirements.
  2. You have sudo or root privilege.
  3. You have a license key provided by TigerGraph.

Also, to use the TigerGraph platform effectively, you should have sufficient memory and disk space to store your graph data.

If you are updating from a previous version of the TigerGraph™ Platform, first read the section below on Update/Upgrade FAQs.


New Installation

Besides the basic requirements mentioned above (HW & SW requirements, sudo/root privilege, license key), there are two more requirements for installation:

  • As mentioned in the Hardware and Software Requirements, you need to pre-install a few basic Linux utilities:
    1. tar/gzip
    2. crontab
    3. ip
    4. ssh/sshd
    5. more
    6. netstat
  • Obtain the TigerGraph package from the TigerGraph presales/support team or download it from our download site:
    http://service.tigergraph.com/download/tigergraph-<version>-offline.tar.gz

    For example, if <version> is 1.0, the file name is tigergraph-1.0-offline.tar.gz.

Now you are ready to begin installation.

  1. Extract files from the package:

    tar xzf tigergraph-<version>-offline.tar.gz
  2. A folder named offline-package will be created. Change into this folder. To install with default settings, just run the install.sh script.

    cd offline-package
    ./install.sh

The installer will ask you a few questions.

  • Do you agree to the License Terms and Conditions?
  • What is your license key?
  • Do you want to use the default Linux user name or select your own?
  • Do you want to use the default installation folder or select your own?

That's it! See Next Steps below.

If you want to customize the installation and understand better what will happen during installation, read the Installation Details section below.


Installation Details

The install.sh command will do the following:

    1. Install the system prerequisites.
    2. Ask you to accept the end user license agreement. Please read it carefully, and accept accordingly. If you do not accept, the install process will be terminated.
      This step cannot be bypassed.
    3. Prompt you to enter your license key (unless you provided it already on the command line). You cannot proceed to the next step until you enter a license key.
    4. Ask you if you want to change the default settings for user name and installation directory. The user account corresponds to a Linux account. This can either be an existing account, or the installer will create a new account. The default settings are user = tigergraph, password = tigergraph, and TigerGraph.root.dir = ${GSQL_USER_HOME}/tigergraph. Optionally, you can provide values in the command line, or you can instruct the installer to automatically accept the default settings.

    5. Configure the TigerGraph Platform.
    6. Install and deploy the TigerGraph Platform.

Configuration Options

The following default settings will be applied if no parameters are specified:

  • The installer will create a user called tigergraph, with password tigergraph.
  • The root directory for the installation (referred to as <TigerGraph.root.dir>) is a folder called tigergraph located in the tigergraph user's home directory, that is,
    /home/tigergraph/tigergraph/.

The installation can be customized by running the install.sh script with command line options:

Syntax: install.sh [-h] [-u <user>] [-p <password>] [-r <tigergraph_root_dir>] [-l <license_key>] [-f]

$ install.sh -h
The default configuration is TigerGraph user: tigergraph, TigerGraph password: tigergraph, TigerGraph root dir: /home/tigergraph/tigergraph.
  -h -- show this message
  -l -- TigerGraph license key (If not specified, it will prompt the user for the license key.)
  -p -- TigerGraph password (If not specified, the default password is tigergraph.)
  -r -- TigerGraph root directory (If specified: <tigergraph_root_dir>/tigergraph. If not specified: ${GSQL_USER_HOME}/tigergraph)
  -u -- TigerGraph user (If not specified, the default user name is tigergraph.)
  -f -- suppress prompts, and continue installation despite warnings.
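
For an unattended installation, the options above can be combined into a single call. The sketch below only assembles and prints such a command line so that it can be reviewed before it is run. The function name install_cmd and the example values are hypothetical, and the values are assumed to contain no whitespace or shell metacharacters. The -p option is left out on purpose, so the default password applies and should be changed after installation.

```shell
#!/bin/sh
# Hypothetical helper (not part of the TigerGraph package): print a
# non-interactive install.sh command line built from the documented
# -u, -r, -l and -f options. It does not run the installer.
install_cmd() {
    user=$1
    root_dir=$2
    license_key=$3
    printf './install.sh -u %s -r %s -l %s -f\n' "$user" "$root_dir" "$license_key"
}

# Usage (placeholder values):
#   install_cmd tigergraph /data YOUR_LICENSE_KEY
```

With -r /data, the option listing above indicates that the resulting root directory is /data/tigergraph.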


If the installation ran successfully, see Next Steps below.



Next Steps

After the installation finishes, the installer will automatically change users to the new TigerGraph user you just created.

If you installed with the default password, we recommend that you change it now.


To confirm correct operation:

  • Try the command gadmin status

    If the system installed correctly, the command should report that zk, kafka, dict, nginx, gsql, and Visualization are up and ready.
    However, since there is no graph data loaded yet, gse, gpe, and restpp are not initialized.

  • Try the command gsql --version


You are now finished!  If you are a first-time user:

  • Try some demonstrations: Go to the <TigerGraph_Root_Dir>/document/DEMO folder and run ./RUN_DEMO.sh
  • Start learning, with one of our tutorials, such as GSQL Tutorial and Demo Examples


If you wish to perform additional configuration, see the appropriate sections of the TigerGraph System Administrators Guide v1.1.


Updating a System

Update/Upgrade FAQs


Q: Is the new version backward-compatible with previous versions?

A: If you are updating from v0.8.x, please contact support@tigergraph.com for the Release Notes for TigerGraph Platform v1.1, which give a detailed review of changes between v1.1 and previous releases. If you are currently running v0.8 or v0.8.1, the majority of language and API features are backward-compatible. A few features have changed, so consult the Release Notes to see if these affect you.

After 1.0, TigerGraph plans to use the following version numbering scheme:
Major.Minor.Patch (Patch number may be omitted if its value is 0.)

  • Major is incremented when there are significant changes which are not backward compatible.
  • Minor is incremented when there are new features but they are backward compatible.
  • Patch is incremented when there are only bug fixes which are backward compatible.

If you are updating from a version older than 0.8, please contact TigerGraph Support to review your individual situation.


Q: Can I use my previous license key?

A: TigerGraph v1.1 introduces a new license key system. Previous keys will continue to work until they expire; all new keys will use the new format.

Q: Do you support live updates?

A: The current version does not support live updates. The TigerGraph services need to be turned off during an update.

Q: Will my graph data store be saved?

A: Yes, it is possible to keep the graph data store from v0.8 or later and use it in a v1.x system. For minor updates, no special steps are required. For major updates (e.g., from v0.8 to v1.1), a migration tool should be used.

Q: Will my catalog (definitions of graph schema, loading jobs and queries) be saved?

A: For minor or patch updates, the catalog will be preserved, but the queries will need to be reinstalled. See details in Update Procedure below.

Q: I have other files, not part of the original TigerGraph installation (e.g., command scripts, input data files, log files) stored in the file structure. Will they be saved?

A: Your files will be saved, but they might be moved. The installation program will create a backup folder to save the old versions of files before installing the new versions. In particular, the entire folder <TigerGraph.Root.Dir>/dev will be copied to a new folder called <TigerGraph.Root.Dir>/dev_<datetime>, before creating a new /dev folder. The files include but are not limited to:

Update Procedure for Minor Updates

These instructions are only for updates that maintain 100% backward compatibility. Please refer to the Release Notes for your intended new version to see whether your proposed update is eligible to follow this procedure.

  1. Back up your files. (Optional but recommended if this is your first time performing an update.)  Any files you wish to save (command files, input data files, personal files, log files, etc.) should be stored outside of the TigerGraph root folder.
  2. Allow any graph operations to finish. Then follow the procedure at the beginning of this document for installing a new system.  The installer will automatically shut down your system and start it again.

    Be sure to specify the same username as your current installation. Otherwise, if you use a different user name, it will be treated as a new installation, with an empty graph.

  3. Verify: Run the command gsql to start the GSQL shell. The first time after an update, gsql performs two important operations:

    1. Copies your catalog from your old installation to the new installation.

    2. Compares the files in the backup /dev_<datetime>/gdk/gsql/src folder to the new /dev/gdk/gsql/src folder. Pay attention to any files residing in the old folder but not in the new folder.  Review them and copy them to the new folder if appropriate.  See the example below.

  4. List the jobs in the catalog using the ls command. Check to see if all your queries reinstalled successfully. Even in a minor update, there may be new features or bug fixes that provide a better format for your queries. If you decide to make changes to your queries, reinstall them.

    Export a copy of a query
    $ gsql 'SHOW QUERY qname > q1.gsql'   # save a copy of your query
    # Modify your query if needed or wanted

    If you decide to change the query:

    Install a new version of a query
    $ gsql 'DROP QUERY q1'
    $ gsql q1.gsql
    $ gsql install query q1

Example: Comparing Old and New Installations

Diagnostic output of 'gsql' after a software update.

[RUN ] cp -r /home/tigergraph/tigergraph/dev_2016.08.25-10.27.48/gdk/gsql/.catalog /home/tigergraph/tigergraph/dev/gdk/gsql/

[*.?pp] oldest: /home/tigergraph/tigergraph/dev_2016.08.25-10.27.48/gdk/gsql/src/ReducerLib/ReducerLib.cpp 2016-04-27 13:51:10
[*.?pp] newest: /home/tigergraph/tigergraph/dev_2016.08.25-10.27.48/gdk/gsql/src/TokenBank/JOB_load_videoE.cpp 2016-08-25 10:00:25.221119

!!! Found /home/tigergraph/tigergraph/dev_2016.08.25-10.27.48/gdk/gsql/src/ has user defined token functions
Please merge to /home/tigergraph/tigergraph/dev/gdk/gsql/src/

[RUN ] touch /home/tigergraph/tigergraph/dev/gdk/gsql/.hasDDL
Welcome to GSQL Shell version: master

Type 'help' for help.

GSQL >

The first [RUN ] line tells us that the catalog is being copied to the new installation.

Next, the gsql version checker compares *.?pp (.cpp and .hpp) files in the new and backup dev/gdk/gsql/src folders. While the majority of code for the GSQL platform is precompiled object code, a small portion is in source code format. This source code relates to (1) queries and loader jobs that the user wrote, (2) User-Defined Functions (UDFs) that the user wrote, or (3) a few standard functions that TigerGraph implemented in the same style as UDFs.

The warning (the lines beginning with "!!!") indicates that the gsql version checker found a discrepancy. It advises the user to merge /home/tigergraph/tigergraph/dev_2016.08.25-10.27.48/gdk/gsql/src/ to /home/tigergraph/tigergraph/dev/gdk/gsql/src/.

To investigate, perform a recursive diff between these two folders:

Diff between backup and new folders
$ diff -r dev_2016.08.25-10.27.48/gdk/gsql/src dev/gdk/gsql/src
diff -r dev_2016.08.25-10.27.48/gdk/gsql/src/QueryUdf/ExprFunctions.hpp dev/gdk/gsql/src/QueryUdf/ExprFunctions.hpp
29a30
> #include <gle/engine/cpplib/headers.hpp>
Only in dev/gdk/gsql/src: src
Only in dev_2016.08.25-10.27.48/gdk/gsql/src/TokenBank: JOB_load_videoE.cpp
Only in dev_2016.08.25-10.27.48/gdk/gsql/src/TokenBank: JOB_load_videoV.cpp

This example tells us the following:

  • There is some content that is only in the new folder. This clearly represents intentional updates by TigerGraph. No action is needed.
    • The new version of ExprFunctions.hpp contains an additional #include line.
    • The new folder dev/gdk/gsql/src contains a file or folder called src.
  • The backup folder dev_2016.08.25-10.27.48/gdk/gsql/src/TokenBank contains two files not found in the corresponding new folder: JOB_load_videoE.cpp and JOB_load_videoV.cpp. load_videoE and load_videoV are two loading jobs from one of our training examples, which were in the catalog when the update installation was performed. If the user wants to retain these loading jobs, the files should be copied from the backup folder to the corresponding new folder.
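
When the diff output is long, it may be easier to list only the files that exist in the backup folder but not in the new folder, since those are the candidates for copying. The following is a minimal sketch using standard tools (find, sort). The function name backup_only_files is hypothetical. The new folder should be given as an absolute path or as a path relative to the directory from which the function is called.

```shell
#!/bin/sh
# Hypothetical helper: print the relative path of every regular file that
# exists under the backup folder ($1) but has no counterpart at the same
# relative path under the new folder ($2).
backup_only_files() {
    backup_dir=$1
    new_dir=$2
    ( cd "$backup_dir" && find . -type f ) | sort | while IFS= read -r rel; do
        [ -e "$new_dir/$rel" ] || printf '%s\n' "${rel#./}"
    done
}

# Usage, run from <TigerGraph.Root.Dir> with the folders from the example:
#   backup_only_files dev_2016.08.25-10.27.48/gdk/gsql/src dev/gdk/gsql/src
```

Unlike diff -r, this sketch does not report files that exist in both folders with different contents, so it complements the diff rather than replacing it.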





Activating a System-Specific License

This guide provides step-by-step instructions for activating or renewing a TigerGraph license, by generating and installing a license key unique to that TigerGraph system. This document applies to both non-distributed and distributed systems. In this document, a cluster acting cooperatively as one TigerGraph database is considered one system.

A valid license key activates the TigerGraph system for normal operation. A license key has a built-in expiration date and is valid on only one system. Some license keys may apply other restrictions, depending on your contract. Without a valid license key, a TigerGraph system can perform certain administration functions, but database operations will not work.

To activate a new license, a user first configures their TigerGraph system. The user then collects the fingerprint of the TigerGraph system (the so-called license seed) using a TigerGraph-provided utility program. The collected materials are then sent to TigerGraph or an authorized agent via email or web form. TigerGraph certifies the license based on the collected materials and sends a license key back to the user. The user then installs the license key on their system using another TigerGraph command. A new license key (e.g., one with a later expiration) can be installed on a live system that already has a valid license; the installation process does not disrupt database operations.

Step-by-Step Guide

Note: Before beginning the license activation process, the TigerGraph package must be installed on each server, and the TigerGraph system must be configured with gadmin.

  1. Collect the fingerprint of the whole TigerGraph system using the command tg_lic_seed, which can be executed on any machine in the system. The command tg_lic_seed packs all the collected data into a local file (named tigergraph_seed). When tg_lic_seed has completed successfully, it outputs the path of the collected data to the console.

    Collect Fingerprint of TigerGraph System

    $ tg_lic_seed

    seed file is ready at /home/tigergraph/tigergraph/tigergraph_seed
  2. Send the tigergraph_seed file to TigerGraph, either through our license activation web portal (preferred) or by email to license@tigergraph.com. If using email, please include the following information:
    1. Company/Organization name
    2. Contract number. If you do not know your contract number, please contact your sales representative or sales@tigergraph.com.
  3. If the contract and license seed are in good order, a new license key file will be certified and sent back to you.
  4. Copy the license key file to a directory on the TigerGraph system where the TigerGraph Linux user has read permission.
  5. To install the license key, run the command tg_lic_install, specifying the path to the license key file.

    Install License

    $ tg_lic_install
    Usage: tg_lic_install <license_path>

    If installation is completed successfully, the message "install license successfully" will be displayed in the console. Otherwise, another message "failed to install license" will be displayed.

Checking License Information

After a license key has been installed successfully on a TigerGraph system, the information of the installed license is available via the following REST API:

Get License Information

$ curl -X GET "localhost:9000/showlicenseinfo"
  {
    "message": "",
    "error": false,
    "version": {
      "schema": 0,
      "api": "v2"
    },
    "code": "",
    "results": [
      {
        "Days remaining": 10160,
        "Expiration date": "Mon Oct  2 04:00:00 2045\n"
      }
    ]
  }
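
To monitor the license expiration from a script, the "Days remaining" value can be pulled out of this response. The following is a minimal sketch that assumes the response has the shape shown above. The function name license_days_remaining and the 30-day threshold in the usage comment are hypothetical, and a JSON-aware tool would be more robust than the sed pattern used here.

```shell
#!/bin/sh
# Hypothetical helper: read the /showlicenseinfo JSON response on stdin and
# print the integer value of its "Days remaining" field.
license_days_remaining() {
    sed -n 's/.*"Days remaining": *\([0-9][0-9]*\).*/\1/p'
}

# Usage (endpoint as shown above; the 30-day threshold is an assumption):
#   days=$(curl -s -X GET "localhost:9000/showlicenseinfo" | license_days_remaining)
#   [ "$days" -ge 30 ] || echo "License expires in $days days" >&2
```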




Managing TigerGraph Servers with gadmin

Introduction

TigerGraph Graph Administrator (gadmin) is a tool for managing TigerGraph servers. It has a self-contained help function and a man page, whose output is shown below for reference. If you are unfamiliar with the TigerGraph servers, please see GET STARTED with TigerGraph.

To see a listing of all the options or commands available for gadmin, run any of the following commands:

$ gadmin -h
$ man gadmin
$ info gadmin


After changing a configuration setting, it is generally necessary to run gadmin config-apply. Some commands invoke config-apply automatically. If you are not certain, just run gadmin config-apply.

Command Listing

Below is the man page for gadmin. Most of the commands are self-explanatory.

GADMIN(1)                        User Commands                        GADMIN(1)

NAME
       gadmin - manual page for TigerGraph Administrator.

SYNOPSIS
       gadmin [options] COMMAND [parameters]

DESCRIPTION
       Version 1.0, Sept, 19, 2017

       gadmin is a tool for managing TigerGraph servers

OPTIONS
       -h, --help
              show this help message and exit

       --configure
              invoke interactive (re)configuration tool. Options: single_dir:/xxx/yyy(deploy directory will be /xxx/yyy), or a keyword(e.g., 'gadmin --configure port', will configure any entry whose name has string 'port')

       --set  set one configuration

       --dump-config
              dump current configuration after parsing config files and command line options and exit

       --dry-run
              show what operation will be performed but don't actually do it

       -p SSH_PASSWORD, --password=SSH_PASSWORD
              the password to ssh to other nodes

       -y, --yes
              silently answer Yes to all prompts

       -v, --verbose
              enable verbose output

       --version
              show gadmin version and exit

       -f, --force
              execute without performing checks

       --wait wait for the last command to finish (e.g., snapshot)

       Commands:

       Server status
              gadmin status [gpe gse restpp dict,...]

       IUM status
              gadmin ium_status

       Disk space of devices
              gadmin ds [path]

       Mount info of a path
              gadmin mount {path}

       Memory usage of TigerGraph components
              gadmin mem [gse gpe restpp dict,...]

       CPU usage of TigerGraph components
              gadmin cpu [gse gpe restpp dict,...]

       Check TigerGraph system prerequisites and resources
              gadmin check

       Show log of gpe, gse, restpp and issued fab commands
              gadmin log [gse gpe restpp dict fab,...]

       Get various information about gpe, gse and restpp
              gadmin info [gse gpe restpp dict,...]

       Software version(s) of TigerGraph components
              gadmin version [gse gpe restpp dict,...]

       Stop specified or all services
              gadmin stop [gse gpe restpp dict,...]

       Restart specified or all services
              gadmin restart [gse gpe restpp dict,...]

       Start specified or all services
              gadmin start [gse gpe restpp dict,...]

       Start the RESTPP loaders
              gadmin start_restpp_loaders

       Start the KAFKA loaders
              gadmin start_kafka_loaders

       Stop the RESTPP loaders
              gadmin stop_restpp_loaders

       Stop the KAFKA loaders
              gadmin stop_kafka_loaders

       Dump partial or full graph to a directory
              gadmin dump_graph {gse, gpe [*, segment], all}, dir, separator

       Snapshot gpe and gse
              gadmin snapshot

       Reset the kafka queues
              gadmin reset

       Show the available packages
              gadmin pkg-info

       Install new package to TigerGraph system
              gadmin pkg-install

       Update gpe, gse, restpp, dict, etc. without configuration change
              gadmin pkg-update

       Remove available packages or binaries from package pool
              gadmin pkg-rm [files]

       Apply new configure. Note some modules may need to restart
              gadmin config-apply [gse gpe restpp dict kafka zk]

       Set a new license key
              gadmin set-license-key license key string

       Update the new graph schema
              gadmin update_graph_config

       Update components under a directory
              gadmin update

       Setup sync of all gstore data in mutiple machines
              gadmin setup_gstore_sync

       Setup rate control of RESTPP loader
              gadmin setup_restpploader_rate_ctl

       Restart sync of all gstore data in mutiple machines
              gadmin gstore_sync_restart

       Stop sync of all gstore data in mutiple machines
              gadmin gstore_sync_stop

       For more information, updates and news, visit gadmin website: http://www.tigergraph.com

SEE ALSO
       The full documentation for gadmin is maintained as a Texinfo manual. If the info and gadmin programs are properly installed at your site, the command

              info gadmin

       should give you access to the complete manual.

TigerGraph Administrator.            Sept 2017                        GADMIN(1)

Examples

Checking the status of TigerGraph component servers:

Use "gadmin status" to report whether each of the main component servers is running (up) or stopped (down). The example below shows the normal status when the graph store is empty and a graph schema has not been defined:

$ gadmin status

=== zk ===
[SUMMARY][ZK] process is up
[SUMMARY][ZK] /home/tigergraph/tigergraph/zk is ready
=== kafka ===
[SUMMARY][KAFKA] process is up
[SUMMARY][KAFKA] queue is ready
=== gse ===
[SUMMARY][GSE] process is down
[SUMMARY][GSE] id service has NOT been initialized
=== dict ===
[SUMMARY][DICT] process is up
[SUMMARY][DICT] dict server is ready
=== graph ===
[SUMMARY][GRAPH] graph has NOT been initialized
=== restpp ===
[SUMMARY][RESTPP] process is down
[SUMMARY][RESTPP] restpp has NOT been initialized
=== gpe ===
[SUMMARY][GPE] process is down
[SUMMARY][GPE] graph has NOT been initialized
=== glive ===
[SUMMARY][GLIVE] process is up
[SUMMARY][GLIVE] glive is ready
=== Visualization ===
[SUMMARY][VIS] process is up (WebServer:2254; DataBase:2255)
[SUMMARY][VIS] Web server is working
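
For scripted health checks, the status report can be filtered down to the components whose process is not running. The following is a minimal sketch that assumes the "[SUMMARY][NAME] process is down" line format shown in the example above. The function name down_components is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read "gadmin status" output on stdin and print the
# name of each component whose summary line reports "process is down".
down_components() {
    sed -n 's/^\[SUMMARY]\[\([A-Za-z]*\)] process is down.*/\1/p'
}

# Usage:
#   gadmin status | down_components
```

Applied to the example output above, this prints GSE, RESTPP and GPE, one per line, which is the expected result for an empty graph store.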


Stopping a particular server, such as the REST server (name is "restpp"):

$ gadmin stop restpp


Changing the retention size of the queue to 10 GB:

$ gadmin --set -f online.queue.retention_size 10

Updating the TigerGraph License Key

A TigerGraph license key is initially set up during the installation process. If you have obtained a new license key,  run the command

gadmin set-license-key <new_key>


to install your new key. You should then follow this with

gadmin config-apply



Example: Setting the license key

$ gadmin set-license-key new_license_key

[RUN ] /home/tigergraph/.gsql/gpe_auto_start_add2cron.sh
[RUN ] /home/tigergraph/.gsql/all_log_cleanup_add2cron.sh
[RUN ] rm -rf /home/tigergraph/tigergraph_coredump
[RUN ] mkdir -p /home/tigergraph/tigergraph/logs/coredump
[RUN ] ln -s /home/tigergraph/tigergraph/logs/coredump /home/tigergraph/tigergraph_coredump

$ gadmin config-apply
[FAB ][2017-03-31 15:03:05] check_config

[FAB ][2017-03-31 15:03:06] update_config_all
Local config modification Found, will restart dict server and update configures.
[FAB ][2017-03-31 15:03:11] launch_zookeepers

[FAB ][2017-03-31 15:03:21] launch_gsql_subsystems:DICT
[FAB ][2017-03-31 15:03:22] gsql_mon_alert_on
Local config modification sync to dictionary successfully!

$






Backup and Restore

Introduction and Syntax

GBAR (Graph Backup And Restore) is an integrated tool for backing up and restoring the data and data dictionary (schema, loading jobs, and queries) of a single TigerGraph node. In Backup mode, it packs TigerGraph data and configuration information into a single file and stores it on local disk or in a remote AWS S3 bucket. Multiple backup files can be archived. Later, you can use the Restore mode to roll back the system to any backup point. This tool can also be integrated easily with Linux cron to perform periodic backup jobs.

The current version of GBAR is intended for restoring the same machine that was backed up. For help with cloning a database (i.e., backing up machine A and restoring the database to machine B), please contact support@tigergraph.com.



Synopsis
Usage: gbar backup [options] -t <backup_tag>
       gbar restore [options] <backup_tag>
       gbar config
       gbar list

Options:
  -h, --help      Show this help message and exit
  -v              Run with debug info dumped
  -vv             Run with verbose debug info dumped
  -y              Run without prompt
  -t BACKUP_TAG   Tag for backup file, required on backup


The -y option forces GBAR to skip interactive prompt questions by selecting the default answer. There are currently five situations for prompts:

  • During backup, if GBAR calculates there is insufficient disk space to copy and then compress the graph data, it will ask: Do you want to continue?(y/N). The default answer is no.
  • At the start of restore, GBAR will always ask if it is okay to stop and reset the TigerGraph services: (y/N)? The default answer is yes.
  • During restore, if the user does not provide a full backup file name as the backup_tag on the command line, and multiple files match that tag, GBAR chooses the latest by default and asks: Do you want to continue?(y/N) The default answer is yes.
  • During restore, if GBAR calculates there is insufficient disk space to copy the current graph data and then uncompress the archived data, it will ask: Do you want to continue?(y/N). The default answer is no.
  • After restore, old gstore data will be left on disk. GBAR needs your confirmation to remove it, and will ask: Do you want to continue removing it?(y/N). The default answer is no.

Config

gbar config

GBAR Config must be run before using GBAR backup/restore functionality. GBAR Config will open the following configuration template interactively in a text editor. Using the comments as a guide, edit the configuration file to set the configuration parameters according to your own needs.

Synopsis
# Configure file for GBAR
# you can specify storage method as either local or s3, or both

# Assign True if you want to store backup files on local disk
# Assign False otherwise, in this case no need to set path
store_local: False
path: PATH_TO_BACKUP_REPOSITORY

# Assign True if you want to store backup files on AWS S3
# Assign False otherwise, in this case no need to set AWS key and bucket
store_s3: False
aws_access_key_id: YOUR_ACCESS_KEY
aws_secret_access_key: YOUR_SECRET_KEY
bucket: YOUR_BUCKET_NAME

# The maximum timeout value to wait for core modules(GPE/GSE) on backup.
# As a roughly estimated number,
# GPE & GSE backup throughoutput is about 2GB in one minute on HDD.
# You can set this value according to your gstore size.
# Interval string could be with format 1h2m3s, means 1 hour 2 minutes 3 seconds,
# or 200m means 200 minutes.
# You can set to 0 for endless waiting.
backup_core_timeout: 5h
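
The backup_core_timeout comment in the template gives a rough throughput of 2 GB per minute on HDD. The sketch below turns that figure into a rough time estimate for a given gstore size, as a starting point for choosing the timeout. The function name estimate_backup_minutes is hypothetical, and the result is only as reliable as the template's rough estimate.

```shell
#!/bin/sh
# Hypothetical helper: estimate the GPE/GSE backup time in minutes for a
# gstore of the given size in whole GB, assuming about 2 GB per minute
# (the rough HDD figure from the config template). Rounds up.
estimate_backup_minutes() {
    size_gb=$1
    echo $(( (size_gb + 1) / 2 ))
}

# Usage:
#   estimate_backup_minutes 600
```

For a 600 GB gstore, the estimate is 300 minutes, which equals the template's default of 5h. Since the default then leaves no margin, a larger value such as 6h may be a safer choice for a gstore of that size.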

Backup

gbar backup -t <backup_tag>

The backup_tag acts like a filename prefix for the archive filename. The full name of the backup archive will be <backup_tag>-<timestamp>.tgz.

GBAR Backup performs a live backup, meaning that normal operations may continue while backup is in progress. When GBAR backup starts, it sends a request to GADMIN, which then requests the GPE and GSE to create snapshots of their data. Per the request, the GPE and GSE store their data under GBAR’s own working directory. GBAR also directly contacts the Dictionary and obtains a dump of its system configuration information. In addition, GBAR records the TigerGraph system version. Then, GBAR compresses all the data and configuration information into a single file named <backup_tag>-<timestamp>.tgz. As the last step, GBAR copies that file to local storage or AWS S3, according to the Config settings, and removes all temporary files generated during backup.

The current version of GBAR Backup takes snapshots quickly to make it very likely that all the components (GPE, GSE, and Dictionary) are in a consistent state, but it does not fully guarantee consistency. We highly recommend that no data update be in progress when you issue the backup command. A no-write time period of about 5 seconds is sufficient.

Backup does not save input message queues for REST++ or Kafka.
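
Periodic backups can be scheduled with cron by wrapping the backup command in a small script. The following is a minimal sketch: the function name run_backup, the GBAR_CMD override variable, the tag "nightly", and the script path in the crontab comment are all hypothetical. It assumes that gbar config has already been run and that gbar is on the PATH of the cron environment.

```shell
#!/bin/sh
# Hypothetical wrapper for scheduled backups. GBAR_CMD can be overridden
# (for example with "echo") to preview the command without running it.
GBAR_CMD=${GBAR_CMD:-gbar}

run_backup() {
    # -y skips prompts by taking their default answers; -t sets the tag.
    "$GBAR_CMD" backup -y -t "$1"
}

# Example crontab entry, assuming this file is saved as an executable
# script that ends with the line "run_backup nightly". Runs daily at 02:00:
#   0 2 * * * /home/tigergraph/bin/nightly_backup.sh
```

Because -y selects the default answer, a backup started this way stops rather than continues if GBAR calculates that disk space is insufficient, so the job's output should be checked.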

Restore

gbar restore <backup_tag>

Restore is an offline operation, requiring the data services to be temporarily shut down. The backup tag acts as a filename prefix. During restore, the user can provide either the tag (filename prefix) or the full filename with timestamp information in the name. When GBAR restore begins, it first searches for a backup file matching the backup_tag supplied in the command line. If multiple matching backup archives are found, GBAR will select the most recent one and ask the user for confirmation to continue. Then it decompresses the backup file to a working directory. As the next step, GBAR will compare the TigerGraph system version in the backup archive with the current system's version, to make sure that the backup archive is compatible with the current system. It will then shut down the TigerGraph servers (GSE, RESTPP, etc.) temporarily. Then, GBAR makes a copy of the current graph data, as a precaution. If GBAR estimates that there is not sufficient disk space for the copy, GBAR will display a warning and prompt the user to abort (unless the user has overridden the prompt with the -y option). Next, GBAR copies the backup graph data into the GPE and GSE and notifies the Dictionary to load the configuration data. When these actions are all done, GBAR will restart the TigerGraph servers.

The primary purpose of GBAR is to save snapshots of the data configuraton of a TigerGraph system, so that in the future the same system can be rolled back (restored) to one of the saved states.  A key assumption is that Backup and Restore are performed on the same machine, and that the file structure of the TigerGraph software has not changed. Specific requirements are listed below.

Restore Requirements and Limitations


Restore is supported if the TigerGraph system has had only minor version updates since the backup.

  • TigerGraph version numbers have the format X.Y[.Z], where X is the major version number and Y is the minor version number.
  • Restore is supported if the backup archive and the current system have the same major version number AND the current system has a minor version number that is greater than or equal to the backup archive minor version number.
  • Backup archives from a 0.8.x system cannot be Restored to a 1.x system.
  • Examples:

    Backup archive's system version   Current system version   Restore is allowed?
    0.8                               1.0                      NO - Major versions differ
    1.1                               1.1                      YES - Major and minor versions are the same
    1.1                               1.2                      YES - Major versions are the same; current minor version > archived minor version
    1.1                               1.0                      NO - Major versions are the same; current minor version < archived minor version
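The compatibility rule above can be expressed as a short Python check. This is a minimal sketch for illustration only; the function names parse_version and restore_allowed are hypothetical and are not part of GBAR.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, int]:
    """Split a version string of the form X.Y[.Z] into (major, minor)."""
    parts = text.split(".")
    return int(parts[0]), int(parts[1])


def restore_allowed(backup_version: str, current_version: str) -> bool:
    """Apply the rule stated above: same major version, and the current
    minor version is greater than or equal to the archive's minor version."""
    b_major, b_minor = parse_version(backup_version)
    c_major, c_minor = parse_version(current_version)
    return b_major == c_major and c_minor >= b_minor


# The four rows of the examples table.
for backup, current in [("0.8", "1.0"), ("1.1", "1.1"), ("1.1", "1.2"), ("1.1", "1.0")]:
    verdict = "YES" if restore_allowed(backup, current) else "NO"
    print(f"backup {backup} -> system {current}: {verdict}")
```

The optional third component (Z) is ignored, since the rule only refers to the major and minor numbers.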


Restore needs enough free space to accommodate both the old gstore and the gstore to be restored.

After restore, old gstore data will be left on disk by default. To remove the old data, either answer "Y" when Restore asks you, or remove it yourself after restore has completed and the system is running again.

List Backup Files

gbar list

This command lists all generated backup files in the storage location configured by the user. For each file, it shows the file’s full tag, its size in human-readable format, and its creation time.

GBAR Detailed Example

The following example is taken from a real run and shows the actual commands, the expected output, and the time and disk space used for a given set of graph data. For this example, an Amazon EC2 instance was used, with the following specifications:

single instance with 32 CPU + 244GB memory + 2TB HDD.

Naturally, backup and restore time will vary depending on the hardware used.

GBAR Backup Operational Details

The flowchart below shows how GBAR processes a backup request.

To run a daily backup, we tell GBAR to back up with the tag name daily.

$ gbar backup -t daily
[SUMMARY] Retrieve TigerGraph system configuration...
[SUMMARY] Check TigerGraph system status...
[SUMMARY] Get TigerGraph version as 1.0
[SUMMARY] Issued snapshot command to GPE/GSE
[SUMMARY] Wait for GPE/GSE snapshot done...
[SUMMARY] GPE/GSE snapshot done in 37m11s
[SUMMARY] Backup DICT...
[SUMMARY] Compress backup data to daily-20171206031441.tgz...
[SUMMARY] Compress data done in 39m38s
[SUMMARY] Clean intermediate files...
[SUMMARY] Backup file daily-20171206031441.tgz size 64.4GB
[SUMMARY] Copy daily-20171206031441.tgz to local storage /home/tigergraph/backups...
[SUMMARY] Copy finished in 10m31s
Backup done in 1h37m43s.

The total backup process took a little over an hour and a half (1h37m43s), and the generated archive is about 64GB. Dumping the GPE/GSE data to disk took 37 minutes. Compressing the files into a single portable backup archive took another 40 minutes, and copying the archive to local storage took about 10 minutes.
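To compare the individual steps with the total, the durations printed in the log can be converted to seconds. The sketch below is an illustration only: the helper name parse_duration is hypothetical, and the duration format (optional hours, minutes, and seconds such as 1h37m43s) is inferred from the log output above.

```python
import re

# Durations in the log look like "37m11s" or "1h37m43s".
_DURATION_RE = re.compile(r"^(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?$")


def parse_duration(text: str) -> int:
    """Convert a duration such as '1h37m43s' into a number of seconds."""
    m = _DURATION_RE.match(text)
    if m is None or not any(m.groups()):
        raise ValueError(f"unrecognized duration: {text!r}")
    hours, minutes, seconds = (int(g) if g else 0 for g in m.groups())
    return hours * 3600 + minutes * 60 + seconds


# Step timings copied from the backup log above.
steps = {"snapshot": "37m11s", "compress": "39m38s", "copy": "10m31s"}
total = parse_duration("1h37m43s")

for name, text in steps.items():
    secs = parse_duration(text)
    print(f"{name}: {secs} s ({100 * secs / total:.0f}% of total)")
```

The three logged steps do not add up to the full total; the remainder is time spent in the steps for which the log prints no duration.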

GBAR Restore Operational Details

This flowchart shows how GBAR runs a restore job.



To restore from a backup archive, tell GBAR the backup tag (daily). If several archives share that tag, GBAR will choose the latest one by default. To select a specific archive to restore, provide GBAR with the full archive name, such as daily-20171206031441. By default, restore will ask the user to approve at least two actions. If you want to pre-approve these actions, use the "-y" option; GBAR will then make the default choice for you.

$ gbar restore daily
[SUMMARY] Retrieve TigerGraph system configuration...
GBAR restore needs to reset TigerGraph system.
Do you want to continue?(y/N):y
[SUMMARY] Multiple backup points found for tag daily, will pick up the latest.
Will restore from the latest one daily-20171206031441.
Do you want to continue?(y/N):y
[SUMMARY] Restore to latest one daily-20171206031441
[SUMMARY] Backup file daily-20171206031441.tgz size 64.4GB
[SUMMARY] Copy daily-20171206031441.tgz to GBAR work dir...
[SUMMARY] Copy finished in 4m13s
[SUMMARY] Decompress daily-20171206031441.tgz with size 64.4GB...
[SUMMARY] Decompress done in 13m23s
[SUMMARY] Backup data with version 1.0 applicable to 1.0 system.
[SUMMARY] Stop TigerGraph system...
[SUMMARY] Start GDICT...
[SUMMARY] Move aside old GPE data...
[SUMMARY] Move aside old GSE data...
[SUMMARY] Snapshot old GDICT data...
[SUMMARY] Restore GPE data...
[SUMMARY] Restore IDS data...
[SUMMARY] Restore DICT...
[SUMMARY] Reset TigerGraph system...
[SUMMARY] Start TigerGraph system...
[SUMMARY] Reinstall all GSQL query...
[SUMMARY] Recompile all loading job...
[SUMMARY] Running post restore jobs...
Restore done in 21m33s.
GPE/GSE old data still saved on disk, you can remove it after TigerGraph system stable, or remove it right now.
Do you want to continue removing it?(y/N):n
GPE/GSE old data saved as /home/tigergraph/tigergraph/gstore/0/part-20171206032413 and /home/tigergraph/tigergraph/gstore/0/<part_id>/ids-20171206032413, you need to remove them manually.

In our test, GBAR took about 20 minutes to finish the restore job. Most of that time (13 minutes) was spent decompressing the backup archive.

Note that after the restore is done, GBAR asks whether to remove the old data. Here we chose no, so we will need to remove the old gstore files manually later, for example after verifying that the restored system is functioning correctly.

Performance Summary

GStore size   Backup file size   Backup time   Restore time
278GB         64GB               1.5 hours     20 minutes





Using Encrypted SSL Connections


TigerGraph supports secure data-in-flight communication, using the SSL/TLS encryption protocol. This applies to any outward-facing channel, including GSQL clients, RESTPP endpoints, and the GraphStudio web interface. When SSL/TLS is enabled, HTTPS takes the place of HTTP for RESTPP and GraphStudio connections.

Prerequisites

You should have basic knowledge about how SSL works:

  1. What the SSL certificate and key are used for
  2. That an SSL certificate is bound to a domain
  3. How an SSL certificate chain works

A good primer on SSL is available at https://httpd.apache.org/docs/2.4/ssl/ssl_intro.html

Nginx-Based

TigerGraph uses the Nginx web server, so SSL configuration makes use of some built-in support in Nginx.

http://nginx.org/en/docs/http/configuring_https_servers.html

Step 1: Obtain an SSL Certificate

The two main options for obtaining an SSL certificate are to generate your own self-signed certificate or to purchase a certificate from a trusted Certificate Authority. Regardless of which method you choose, your certificate should be chained to a trusted root certificate embedded in your browser. The options and details for producing a trusted SSL certificate are beyond the scope of this document. The focus of this document is how to configure your TigerGraph system to use the certificate to enable SSL.

Option 1: Using a Certificate From A Trusted Agent

First, obtain an SSL certificate from a trusted agent of your choice. Certificate vendors will provide clear instructions for ordering a certificate and then for installing it on your system.

Then you can configure the certificate with gadmin --configure ssl

Option 2: Create a Self-Signed Certificate

There are multiple ways to create a self-signed certificate.  One example is shown below.

For simplicity, the method below will use the root certificate directly as the HTTPS server certificate.  This method is satisfactory for testing but should not be used for a production system.


In the example below, the Common Name value should be your server hostname, since HTTPS certificates are bound to domain names.


Self-Signed Certificate generation example using openssl
$ openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout ~/nginx-selfsigned.key -out ~/nginx-selfsigned.crt
Generating a 2048 bit RSA private key
.................................................................................................................................+++
........+++
writing new private key to '/home/tigergraph/nginx-selfsigned.key'
-----
You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
-----
Country Name (2 letter code) [AU]:US
State or Province Name (full name) [Some-State]:California
Locality Name (eg, city) []:Redwood City
Organization Name (eg, company) [Internet Widgits Pty Ltd]:TigerGraph Inc.
Organizational Unit Name (eg, section) []:GLE
Common Name (e.g. server FQDN or YOUR name) []: my.ip.addr.num
Email Address []:engineer@tigergraph.com

Change the Certificate Permission

For security reasons, the certificates can only be used if their file permission is 600 or more restrictive.

$ chmod 600 ~/nginx-selfsigned.*
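To confirm the result of the chmod command, the permission bits can be inspected programmatically. The Python sketch below is an illustration only: the function names are hypothetical, and it assumes that "600 or more restrictive" means that no permission bits are set for group or others.

```python
import os
import stat


def is_owner_only(mode: int) -> bool:
    """True if no group/other permission bits are set (e.g. 600 or 400)."""
    return stat.S_IMODE(mode) & 0o077 == 0


def check_file(path: str) -> bool:
    """Check the permission bits of an existing file on disk."""
    return is_owner_only(os.stat(path).st_mode)


if __name__ == "__main__":
    # Paths taken from the openssl example above.
    for name in ("~/nginx-selfsigned.key", "~/nginx-selfsigned.crt"):
        path = os.path.expanduser(name)
        if not os.path.exists(path):
            print(f"{path}: not found")
        elif check_file(path):
            print(f"{path}: OK")
        else:
            print(f"{path}: too permissive, run chmod 600")
```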


Step 2: Configure SSL with gadmin

With the self-signed certificate successfully generated, you can configure it with gadmin, so that all the HTTP traffic will be protected with SSL.

$ gadmin --configure ssl
Enter new values or accept defaults in brackets with Enter.
Enable SSL with all HTTP responses (SSL Cert required): default False
Nginx.SSL.Enable [False]: True
True
Path to SSL cert bundle (domain cert, intermediate cert and root cert)
Nginx.SSL.Cert []: /home/tigergraph/nginx-selfsigned.crt
/home/tigergraph/nginx-selfsigned.crt
Path to SSL key
Nginx.SSL.Key []: /home/tigergraph/nginx-selfsigned.key
/home/tigergraph/nginx-selfsigned.key
...
Test servers with supplied settings? [Y/n] Y
...
Success. All settings are valid
Save settings? [y/N] y


After saving the settings, apply the configuration settings.

$ gadmin config-apply
[FAB ][2017-12-12 18:48:16] check_config
[FAB ][2017-12-12 18:48:16] update_config_all
Local config modification Found, will restart dict server and update configures.
[FAB ][2017-12-12 18:48:21] launch_zookeepers
[FAB ][2017-12-12 18:48:31] gsql_mon_alert_on
[FAB ][2017-12-12 18:48:31] launch_zookeepers
[FAB ][2017-12-12 18:48:42] launch_gsql_subsystems:DICT
[FAB ][2017-12-12 18:48:42] gsql_mon_alert_on
Local config modification sync to dictionary successfully!


Then restart the nginx service.

$ gadmin restart nginx -y


Testing Your SSL Connection

Now you may test the connection.

A direct curl request to the server will fail due to certificate verification failure:

$ curl https://localhost:44240
curl: (60) server certificate verification failed. CAfile: /etc/ssl/certs/ca-certificates.crt CRLfile: none
More details here: http://curl.haxx.se/docs/sslcerts.html

curl performs SSL certificate verification by default, using a "bundle"
 of Certificate Authority (CA) public keys (CA certs). If the default
 bundle file isn't adequate, you can specify an alternate file
 using the --cacert option.
If this HTTPS server uses a certificate signed by a CA represented in
 the bundle, the certificate verification probably failed due to a
 problem with the certificate (it might be expired, or the name might
 not match the domain name in the URL).
If you'd like to turn off curl's verification of the certificate, use
 the -k (or --insecure) option.


You may use the -k option to turn off the verification, but it is unsafe and not recommended.

To successfully make requests with curl, you will need to specify the certificate by using the --cacert parameter:

$ curl --cacert /home/tigergraph/nginx-selfsigned.crt https://localhost:44240
<!doctype html><html lang="en"><head><meta charset="utf-8"><title>GraphStudio</title><base href="/"><meta name="viewport" content="width=device-width,initial-scale=1"><link rel="icon" type="image/x-icon" href="favicon.ico"><link href="styles.d67299ba9f5d73aecbe2.bundle.css" rel="stylesheet"/></head><body class="mat-typography"><app-root></app-root><script type="text/javascript" src="inline.4aae6a8088c30a61d5b0.bundle.js"></script><script type="text/javascript" src="polyfills.c9b879328f3396b2bbe8.bundle.js"></script><script type="text/javascript" src="vendor.5392e4ea4f904cd1658c.bundle.js"></script><script type="text/javascript" src="main.a39087227fcdf478cd2a.bundle.js"></script></body></html>
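Client programs need the same trust configuration as curl. As a rough Python equivalent of the --cacert request above, the sketch below uses only the standard library. The function names make_context and fetch are hypothetical; the certificate path and URL are the ones from the example and would need to match your own setup. The host name in the URL should also match the name the certificate was issued for, otherwise verification may still fail.

```python
import ssl
import urllib.request
from typing import Optional, Tuple


def make_context(cafile: Optional[str] = None) -> ssl.SSLContext:
    """Build a verifying TLS context.

    With cafile=None the system trust store is used; passing the path of
    the self-signed certificate plays the role of curl's --cacert option.
    """
    return ssl.create_default_context(cafile=cafile)


def fetch(url: str, cafile: Optional[str] = None) -> Tuple[int, bytes]:
    """GET `url` over HTTPS and return (status code, response body)."""
    ctx = make_context(cafile)
    with urllib.request.urlopen(url, context=ctx, timeout=10) as resp:
        return resp.status, resp.read()


if __name__ == "__main__":
    # Values taken from the curl example above; adjust for your system.
    status, body = fetch(
        "https://localhost:44240",
        cafile="/home/tigergraph/nginx-selfsigned.crt",
    )
    print(status, len(body), "bytes")
```

Certificate verification stays enabled in this sketch, which corresponds to using --cacert rather than the unsafe -k option.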


