Register Collector Manager
To connect the collector manager to the Machbase server, register it with the server by executing the following command in machsql.
- manager_name: The name of the collector manager. Duplicate values are not allowed.
- host_addr: The IP address of the server where the collector manager is running.
- host_port: The port number of the server on which the collector manager is running.
After registering a collector manager on the Machbase server, you can query the status in the m$sys_collectormanagers table.
The table shows the identifier, name, port number, address, and execution status of the collector manager.
After registering the collector manager, create the collector object through the collector manager.
Information about the Collector is stored in the Machbase server and can be retrieved. Execute the following command through machsql to create a collector.
- manager_name: The name of the collector manager that runs the collector.
- collector_name: The name of the collector object.
- path_for_template.tpl: The path to the configuration file for the collector. Sample configuration files are located in the "$MACHBASE_COLLECTOR_HOME/collector" directory. It is recommended to copy the sample file closest to your needs, modify it, and save it under a different name.
Prepare Template File
The template file is a text file that describes the Collector's data source, processing method, and storage method. Sample files are provided in the $MACHBASE_COLLECTOR_HOME/collector directory.
Template File Structure
The template file consists of "variable name = value" lines, similar to the Machbase property file. Details of each setting variable are shown in the following table.
Configuration files after Machbase version 3.5 are not backward compatible.
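The "variable name = value" layout can be read with a few lines of code. The following is a minimal Python sketch, not the collector's own parser: the `parse_template` helper is hypothetical, the sample values are made up for illustration, and treating lines that start with "#" as comments is an assumption.

```python
def parse_template(text):
    """Parse 'NAME = value' lines into a dict.

    Blank lines and lines starting with '#' are skipped
    (the '#' comment rule is an assumption, not taken from the manual).
    """
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if not sep:
            continue  # not a "name = value" line
        settings[name.strip()] = value.strip()
    return settings


# Hypothetical values, for illustration only
sample = """
# sample settings
SFTP_HOST = 192.168.0.10
REGEX_PATH = syslog.rgx
"""

settings = parse_template(sample)
print(settings)
```

A helper like this can be handy for checking a modified template for typos before handing it to the collector.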
Data collection method
Sets the data collection method. The available methods are as follows.
- FILE: Reads a specific file on the device where the collector is installed (default).
- SFTP: Reads a file at a remote SFTP path.
- SOCKET: Receives input data through a socket.
- ODBC: Reads data from a database configured through ODBC.
Location of log file to be read
The location of the data file to be read. In SFTP mode, you must specify the absolute path on the remote host. Not used in SOCKET and ODBC modes. You can also specify multiple source files or use a regular expression.
SFTP_HOST: Host IP address
Defaults to 22 if not set.
Is set to anonymous by default.
Is set to anonymous by default.
Socket port number on which the Collector enters data
Collector socket protocol type
Possible values are TCP and UDP. The default value is TCP.
ODBC mode DSN
ODBC mode query
Query string executed to obtain input data from an ODBC data source
Name of the incrementing column in ODBC mode
Only numeric columns are allowed.
External link library path
Not used yet.
Regular expression file for analyzing input data
Not used in ODBC mode.
Location of Python script files for data preprocessing
Wait time after inputting data
In milliseconds, with a default of 1000.
Table name to be entered
Database IP address to be entered
Database port number
Data input method configuration
The value is not used. It is retained for compatibility with past versions.
Whether to automatically generate a table column if it does not exist
If 0, it is not generated. If 1, it is generated automatically.
Default value is 1.
Sets the operation to perform on the input table.
- 0: Do nothing.
- 1: Truncate the existing table.
- 2: Create the table. If an error occurs, write the error to trc and continue.
- 3: Drop the table and recreate it.
Generally recommended to set to 2.
Specifies the encoding of the input data file.
Available values are UTF-8 (default), CP949 (MS949), KSC5601, EUCJP, SHIFTJIS, BIG5 and GB231280.
Determines the order of the input files.
Default value is ASC. DESC is also possible.
Rotation file path configuration
Rotation file number configuration
Rotation file order configuration
Default value is ASC. DESC is also possible.
REGEX_PATH and PREPROCESS_PATH point to files that the collector refers to at run time. Below is a description of the rgx file set in REGEX_PATH.
Regular expression name
This value can be modified, but it is better to keep it because it is also stored in the database.
List of columns in the table
Information on the columns belonging to the table
Regular expressions for data analysis
Regular expression that signifies the end of a record
Regular expression to separate each record. If not set,
COL_LIST describes how the log file is linked to the database columns. Each column is defined by the regular expression result that feeds it and by several attributes. Using COL_LIST, complex log data can be entered into structured table columns.
String that does not contain spaces.
Column data type
Name of the type.
Refers to the size of the column. For string types, specify a value that matches the size to be created. The sizes of the other types are: short (6), int (11), long (20), float (17), double (17), datetime (-defined), ipv4 (15), ipv6 (45), text (64MB), binary (64MB).
Datetime data format when type is datetime
Internally parses the value using the "strptime" function.
e.g.) 'Aug 19 07:56:16' has the format 'month day hour:minute:second'. Therefore, the format value to use is "%b %d %H:%M:%S".
Whether to create index
Creates LSM or KEYWORD LSM index based on type.
0: Do not create. / 1: Create.
Token number within regular expression
Within the REGEX syntax specified in the regular expression file, each area enclosed in parentheses "()" is a token. Token 0 is the entire record. Tokens are then numbered from 1, starting with the first parenthesis.
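The token numbering and the datetime format described above can be tried out together. The following is a minimal Python sketch under stated assumptions: the regular expression and the sample log line are hypothetical and are not the contents of the shipped syslog.rgx, and Python's `re` and `datetime.strptime` only stand in for the collector's own regular expression engine and "strptime" call, which may differ in detail.

```python
import re
from datetime import datetime

# Hypothetical regular expression with three parenthesized tokens:
# token 1 = timestamp, token 2 = host, token 3 = message
pattern = re.compile(r"^([A-Za-z]{3}\s+\d+ \d{2}:\d{2}:\d{2}) (\S+) (.*)$")

# Hypothetical syslog-style record
line = "Aug 19 07:56:16 localhost sshd[1234]: session opened"

match = pattern.match(line)

# tokens[0] is the entire record; tokens[1..3] follow the parentheses in order
tokens = [match.group(i) for i in range(pattern.groups + 1)]

# The record carries no year, so an assumed year is prepended here
# to give Python an unambiguous date to parse.
assumed_year = 2024
tm = datetime.strptime(f"{assumed_year} {tokens[1]}", "%Y %b %d %H:%M:%S")

for number, token in enumerate(tokens):
    print(number, token)
print(tm)
```

In this sketch a column that should hold the host name would use token number 2, and a datetime column built from token 1 would use the format "%b %d %H:%M:%S".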
Below is an example of a syslog.tpl file. The file is provided as a sample in $MACHBASE_COLLECTOR_HOME/collector/syslog.tpl.
The syslog.rgx file is a regular expression file set in the syslog.tpl file. When setting up an rgx file, you can either set it to an absolute path or relative path based on $MACHBASE_COLLECTOR_HOME/collector/regex.
Create the collector "syslog_test" as shown below.
The M$SYS_COLLECTORS table contains information about the registered collectors. The collector with the "RUN_FLAG" column value of 1 is running and if it is 0, the execution is stopped.
To start the registered collector, use the ALTER COLLECTOR statement.
- manager_name : Name of the registered collector manager
- collector_name: The name of the collector to execute.
If an error occurs when executing Collector, you can refer to $MACHBASE_COLLECTOR_HOME/trc/machcollector.trc file for troubleshooting.
After you start the collector with the ALTER COLLECTOR statement, you can see that the value of the RUN_FLAG column has changed to 1.
When you start the Collector, a log table is created on the database server where the collected data is stored. The values of collector_type, collector_addr, collector_origin, and collector_offset are set to default values. The tmp, host, and msg columns set in the syslog.tpl file are also created.
When you execute a query using machsql, make sure that machsql is connected to the Machbase server and that the server is running. If the Machbase server and the collector are installed on different machines, the query may not execute normally if machsql is connected to the collector's machine instead of the Machbase server.
When the Collector is executed, it reads the position of the last data entered and resumes input from that position.
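Resuming from a saved position can be pictured as follows. This is a minimal Python sketch of offset-based resumption in general, not the collector's actual implementation: the `read_new_lines` helper and the sample file are hypothetical, and the position is assumed to be a byte offset into the source file.

```python
import os
import tempfile


def read_new_lines(path, offset):
    """Return the lines found after byte `offset` and the new offset."""
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()
    lines = data.decode("utf-8").splitlines()
    return lines, offset + len(data)


with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "sample.log")

    # First run: the file holds two records
    with open(log_path, "w", encoding="utf-8") as f:
        f.write("first\nsecond\n")
    first_batch, offset = read_new_lines(log_path, 0)

    # More data arrives while the reader is stopped
    with open(log_path, "a", encoding="utf-8") as f:
        f.write("third\n")

    # Second run: start from the saved offset, so only new data is read
    second_batch, offset = read_new_lines(log_path, offset)

print(first_batch)
print(second_batch)
```

Because the second call starts at the saved offset, the records read in the first run are not entered again.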
Below, the last 10 lines of the syslog log are compared with the data entered into the database.
The following are the last 10 records entered into the Machbase server.
You can check whether the collector is executed by the following query.
You can stop the collector with the following command:
Whether the collector is dropped can be confirmed by the following query.
This is used to change the template file after creating the collector and to apply the new contents. The contents of the template file as updated at the time of execution are applied. The following example changes the input table from its original value to "anothertable".
When you look up the meta table, you can see that the input table has been changed to anothertable.