ScopeDB Installation Guideline

STAGE 1: double write

  1. get the infra ready
  • estimate how much capacity is needed for ScopeDB ingest: compare the log line count to AP1 and use AP1's machine specs to size scopedb-ingest and guancedb-insert

  • for the double-write stage we don't need infra for query pods, so we can schedule all query-related pods onto a single node

  • we might need to increase the rate limit of the S3 bucket operation APIs (GET, PUT, LIST, DELETE and Multipart PUT). Refer to this worklog

  2. make sure self-observe is ready for ScopeDB.

We need to enable the OpenTelemetry collector for the DataKits in the cluster. Add the following content to the datakit ConfigMap:

yaml
  opentelemetry.conf: |
    [[inputs.opentelemetry]]
      [inputs.opentelemetry.http]
       enable = true
       http_status_ok = 200
       trace_api = "/otel/v1/trace"
       metric_api = "/otel/v1/metric"
       logs_api = "/otel/v1/logs"
      [inputs.opentelemetry.grpc]
       trace_enable = true
       metric_enable = true
       addr = "0.0.0.0:4317"

and mount the configmap

yaml
            - mountPath: /usr/local/datakit/conf.d/opentelemetry/opentelemetry.conf
              name: datakit-conf
              subPath: opentelemetry.conf

Then, in the datakit DaemonSet YAML, open up the OTel port:

yaml
            - containerPort: 4317
              hostPort: 4317
              name: otlp
              protocol: TCP

Then enable the Zipkin collector by adding zipkin to the ENV_DEFAULT_ENABLED_INPUTS env var:

yaml
env:
  - name: ENV_DEFAULT_ENABLED_INPUTS
    value: cpu, disk, diskio, mem, xxx, zipkin

Then increase the request rate limit from the default of 20 to 200:

yaml
env:
  - name: ENV_REQUEST_RATE_LIMIT
    value: "200"
  3. go into postgres and run the following commands to create the scopedb user

Since the postgres password will be used in a connection string later, we have two options:

  1. generate a password without special characters; any character other than A–Z a–z 0–9 - . _ ~ counts as special.
  2. generate a password with special characters, but percent-encode it in the connection string later (see the sketch after the SQL block below).
sql
CREATE DATABASE scopedb_meta;
CREATE USER scopedb WITH PASSWORD 'your-secure-password';
GRANT ALL PRIVILEGES ON DATABASE scopedb_meta TO scopedb;
ALTER DATABASE scopedb_meta OWNER TO scopedb;
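
If you go with option 2, here is a minimal sketch of percent-encoding a password before it is placed into the connection string (the password shown is hypothetical):

bash
# Percent-encode a password that contains special characters (hypothetical value)
python3 -c 'import urllib.parse; print(urllib.parse.quote("p@ss:w/rd", safe=""))'
# prints: p%40ss%3Aw%2Frd
# which would then be used as: postgres://scopedb:p%40ss%3Aw%2Frd@[host]:[port]/scopedb_meta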
  4. create the following Secret in the guancedb namespace (an apply/verify sketch follows the YAML)
yaml
apiVersion: v1
kind: Secret
metadata:
  name: scopedb
stringData:
  SERVER_META_SERVICE_URL: "postgres://scopedb:[password]@[host]:[port]/scopedb_meta"
  SERVER_DEFAULT_STORAGE_ACCESS_KEY_ID: ""
  SERVER_DEFAULT_STORAGE_SECRET_ACCESS_KEY: ""
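
A minimal sketch for applying and checking the Secret, assuming it is saved locally as scopedb-secret.yaml:

bash
# Apply the Secret to the guancedb namespace and confirm the expected keys are present
kubectl -n guancedb apply -f scopedb-secret.yaml
kubectl -n guancedb describe secret scopedb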
  5. deploy using the helmist repo
  • closely monitor the status of the deployment and look for anomalies (errors, CrashLoopBackOff, ...); see the monitoring sketch after this list
  • check self-observe and make sure the monitoring data is coming in
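
A minimal monitoring sketch, assuming the ScopeDB/GuanceDB pods are scheduled into the guancedb namespace:

bash
# Watch pod status and look for Error / CrashLoopBackOff states
kubectl -n guancedb get pods -w

# Check recent events if anything looks off
kubectl -n guancedb get events --sort-by=.lastTimestamp | tail -n 20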
  6. look at kodo and kodo-x resources; the load on them will roughly double once double_write is enabled, so decide whether these two need more resources

  7. make sure all the scopedb-related NSQ channels are empty; if not, clean them

  8. run this MySQL command

sql
INSERT INTO df_core.main_es_instance
(id, uuid, dbType, name, host, `authorization`, configJSON, isParticipateElection, wsCount, provider, version, timeout, extend, versionType, isCustom, status, creator, updator, createAt, deleteAt, updateAt)
VALUES(100, 'scopdedb_12dsad3', 'scopedb', 'scopedb', '', '{}', '{"read": {"host": "http://guance-select.guancedb:8481"}, "write": {"host": "http://guance-insert.guancedb:8480"}, "version": "v1"}', 0, 0, '', '', '60s', NULL, 'free', 0, 0, 'sys', 'sys', -1, -1, -1);
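
A quick sanity check that the instance row exists (a sketch; the connection parameters are placeholders):

bash
# Confirm the scopedb instance record was inserted
mysql -h <mysql-host> -u <user> -p -e \
  "SELECT id, uuid, dbType, configJSON FROM df_core.main_es_instance WHERE dbType = 'scopedb';"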
  9. configure new workspaces to read/write scopedb

Connect to MySQL and execute the following SQL to enable new workspaces to use ScopeDB:

sql
UPDATE main_es_instance SET isParticipateElection = 1 WHERE uuid = "scopdedb_12dsad3";
  10. enable the double_write config on kodo, kodo-x, and kodo-x-scan
bash
kubectl -n forethought-kodo edit cm kodo
kubectl -n forethought-kodo edit cm kodo-x
kubectl -n forethought-kodo edit cm kodo-x-scan

add this part

yaml
global:
  double_write_scopedb: true

Remove the following content (kodo-inner also needs this configuration removed; it may be absent there, or already set to false):

yaml
global:
  double_write_doris: true
  double_query_doris: true

restart the deployments to make the new config effective

bash
kubectl -n forethought-kodo rollout restart deploy kodo
kubectl -n forethought-kodo rollout restart deploy kodo-x
kubectl -n forethought-kodo rollout restart deploy kodo-inner
kubectl -n forethought-kodo rollout restart deploy kodo-x-scan
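
Optionally wait for each rollout to complete before moving on (sketch):

bash
# Block until every restarted deployment reports a successful rollout
for d in kodo kodo-x kodo-inner kodo-x-scan; do
  kubectl -n forethought-kodo rollout status deploy "$d"
done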
  11. verify dual write is working properly

You can reach nsqadmin through a NodePort service, or through an Ingress with a private domain proxying the nsqadmin service in the middleware namespace, then open the nsqadmin web interface in a browser.

You should observe:

  • A new group of topics named df_logging_scopedb has been added
  • Each topic has consumers (connection count > 0)
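
The same check can also be done against the nsqadmin HTTP API instead of the web UI (a sketch; <nsqadmin-host> is whatever NodePort/Ingress address you exposed above):

bash
# List all topics and confirm the df_logging_scopedb group is present
curl -s http://<nsqadmin-host>/api/topics

# Inspect one topic and check that its channels have connected clients
curl -s http://<nsqadmin-host>/api/topics/<topic-name>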
  12. set up HPA and cluster-autoscaler for scopedb-ingest (see the sketch below)
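
A minimal sketch for the HPA part, assuming scopedb-ingest is a Deployment in the guancedb namespace (the thresholds are placeholders to adjust):

bash
# Create an HPA for the ingest deployment and check that it registers metrics
kubectl -n guancedb autoscale deploy scopedb-ingest --min=2 --max=10 --cpu-percent=70
kubectl -n guancedb get hpa scopedb-ingest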

STAGE 2: double query

  1. get the infra ready for query workloads

  2. determine maximum data retention days for existing workspaces

This step determines how long the dual write period should be. Query the retention days for different data types in each workspace to determine the required dual write duration.

Connect to MySQL and execute the following SQL. If no workspace keeps data for more than 30 days, gradually lower the threshold from 30 until you find the maximum retention actually in use; that value is the required dual-write duration:

sql
-- Query workspaces with retention > 30 days, list them for comparison
select
    uuid, name,
    durationSet -> "$.logging",
    durationSet -> "$.tracing",
    durationSet -> "$.rum",
    durationSet -> "$.apm"
from main_workspace
where
    cast(REPLACE(REPLACE(durationSet -> "$.logging", 'd', ''), '"', '') as UNSIGNED) > 30 or
    cast(REPLACE(REPLACE(durationSet -> "$.tracing", 'd', ''), '"', '') as UNSIGNED) > 30 or
    cast(REPLACE(REPLACE(durationSet -> "$.rum", 'd', ''), '"', '') as UNSIGNED) > 30 or
    cast(REPLACE(REPLACE(durationSet -> "$.apm", 'd', ''), '"', '') as UNSIGNED) > 30;

select
    workspaceUUID,
    main_workspace.name as workspaceName,
    biz_logging_index_cfg.name as indexName,
    biz_logging_index_cfg.duration
from biz_logging_index_cfg
inner join main_workspace
    on biz_logging_index_cfg.workspaceUUID = main_workspace.uuid
where cast(REPLACE(REPLACE(duration, 'd', ''), '"', '') as UNSIGNED) > 30;
  3. import metering data after sufficient dual write duration

After the dual write has run longer than the maximum retention found in step 2, import metering and other data from Doris to ScopeDB.

Data Migration

Execute the following curl requests inside a kodo-inner pod:

bash
# 1. Migrate metering data
curl --request POST \
  --url http://127.0.0.1:9527/v1/data_migration \
  --header 'Content-Type: application/json' \
  --data '{
    "src_db_uuid": "dorisxxx",
    "dst_db_uuid": "scopedbxxx",
    "data_type": "metering",
    "time_range": [
        1704038400000,
        1752062400000
    ],
    "ws_uuids": []
}'

# 2. Migrate system workspace audit events
curl --request POST \
  --url http://127.0.0.1:9527/v1/data_migration \
  --header 'Content-Type: application/json' \
  --data '{
    "src_db_uuid": "dorisxxx",
    "dst_db_uuid": "scopedbxxx",
    "data_type": "AE",
    "time_range": [
        1704038400000,
        1752062400000
    ],
    "ws_uuids": []
}'

# 3. Migrate tenant-specific audit events
curl --request POST \
  --url http://127.0.0.1:9527/v1/data_migration \
  --header 'Content-Type: application/json' \
  --data '{
    "src_db_uuid": "dorisxxx",
    "dst_db_uuid": "scopedbxxx",
    "data_type": "TAE",
    "time_range": [
        1704038400000,
        1752062400000
    ],
    "ws_uuids": []
}'

# 4. Migrate infrastructure labels
curl --request POST \
  --url http://127.0.0.1:9527/v1/data_migration \
  --header 'Content-Type: application/json' \
  --data '{
    "src_db_uuid": "dorisxxx",
    "dst_db_uuid": "scopedbxxx",
    "data_type": "OL",
    "time_range": [
        1704038400000,
        1752062400000
    ],
    "ws_uuids": []
}'

Migration Parameters

| Parameter | Description | Notes |
| --- | --- | --- |
| src_db_uuid | Migration source, usually the Doris instance UUID | Can be obtained from the MySQL df_core.main_es_instance table |
| dst_db_uuid | Migration target, the ScopeDB instance UUID | Can be obtained from the MySQL df_core.main_es_instance table |
| data_type | Specific data type to migrate | 4 available options: metering (metering data), AE (system workspace audit events), TAE (tenant-specific audit events), OL (infrastructure labels) |
| time_range | Migration data time range [start, end] | Start time: earliest data time in the migration source. End time: usually the current time |
| ws_uuids | Ignored, no practical use | Leave as an empty array |

Verify Migration Completion

Each curl request returns a traceID. For example:

json
{"content":"task success! traceID: d24ss5tv3jla72mfebpg"}

Use the traceID to search the kodo container logs for the migration entries. When you find the "end" keyword, the migration task is complete.
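
For example, with the traceID above (a sketch; the migration output may appear in either the kodo or kodo-inner logs):

bash
# Grep the service logs for the traceID and look for the "end" keyword
kubectl -n forethought-kodo logs deploy/kodo-inner --since=24h | grep d24ss5tv3jla72mfebpg
kubectl -n forethought-kodo logs deploy/kodo --since=24h | grep d24ss5tv3jla72mfebpg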

  4. switch query engine to scopedb for all workspaces

Connect to MySQL and execute one of the following SQL statements:

sql
# Update all workspaces to use scopedb as query engine
UPDATE main_workspace SET datastore = JSON_SET(datastore, '$.query_engine_selected', 'scopedb');

# Fill in the wksp UUID and update that workspace to use scopedb as query engine
UPDATE main_workspace SET datastore = JSON_SET(datastore, '$.query_engine_selected', 'scopedb')
WHERE uuid = "wksp_xxxx";

Restart all services in the forethought-kodo namespace after the configuration change (see the sketch below).
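
One way to restart everything in the namespace at once (a sketch; the explicit per-workload list used in stage 3 works as well):

bash
# Restart every Deployment and StatefulSet in forethought-kodo
kubectl -n forethought-kodo rollout restart deploy,sts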

STAGE 3: cutover to scopedb

After the query engine has been switched to ScopeDB for a period of time with no customer complaints, stop the dual write and move all data reads/writes to ScopeDB.

  1. update all workspaces to read/write only with scopedb

Connect to MySQL and execute the following SQL:

sql
# Find workspace IDs and update database
UPDATE main_workspace SET datastore = CAST(REPLACE(CAST(datastore AS CHAR), 'doris', 'scopedb') AS JSON);
UPDATE main_workspace SET esInstanceUUID = "scopdedb_12dsad3";
  2. clean up redis cache

Connect to Redis and delete key information:

bash
# Connect to redis, find keys and delete
redis-cli -h redis_host -p 6379 -a "password" KEYS "wkspInfo:*" | xargs redis-cli -h redis_host -p 6379 -a "password" DEL
redis-cli -h redis_host -p 6379 -a "password" DEL "doris_store_info"
redis-cli -h redis_host -p 6379 -a "password" DEL "scopedb_store_info"
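
To confirm the cleanup took effect (sketch, using the same connection parameters):

bash
# EXISTS returns 0 once the key is gone; the KEYS scan should come back empty
redis-cli -h redis_host -p 6379 -a "password" EXISTS "doris_store_info"
redis-cli -h redis_host -p 6379 -a "password" EXISTS "scopedb_store_info"
redis-cli -h redis_host -p 6379 -a "password" KEYS "wkspInfo:*"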
  3. disable doris instance

Connect to MySQL and execute the following SQL:

sql
# Find doris instance uuid
SELECT * FROM main_es_instance LIMIT 10;

# Set doris instance status to disabled
UPDATE main_es_instance SET status = 3 WHERE uuid = 'xxx';
  4. remove dual write configuration

Remove dual write configuration in forethought-kodo namespace for kodo, kodo-x, and kodo-x-scan ConfigMaps:

yaml
global:
  # Remove this configuration
  double_write_scopedb: true
  5. restart all services in forethought-kodo namespace
bash
kubectl -n forethought-kodo rollout restart deploy kodo
kubectl -n forethought-kodo rollout restart deploy kodo-x
kubectl -n forethought-kodo rollout restart deploy kodo-inner
kubectl -n forethought-kodo rollout restart deploy kodo-x-scan
kubectl -n forethought-kodo rollout restart deploy kodo-asynq-client
kubectl -n forethought-kodo rollout restart deploy kodo-nginx
kubectl -n forethought-kodo rollout restart sts kodo-servicemap
kubectl -n forethought-kodo rollout restart deploy kodo-ws
kubectl -n forethought-kodo rollout restart sts kodo-x-backuplog

After restart completion, notify Zhang Lei that this site's Doris has been taken down.

  6. clean up unused nsq channels

Access the nsqadmin web interface and delete NSQ channels that are no longer needed:

  • Channels/topics with names containing scopedb can be deleted
  • Some topics may have multiple channels - be careful not to delete channels that are still in use
  • To determine whether a channel is in use, check the Channel Message Queues table on the https://<nsqadmin-host>/topics/<topic-name> page. If the connections count is 0, the channel is not being used and can be safely deleted
  7. release doris resources

Release all Doris resources, which may include:

  • Servers (including FE, BE, GuanceDB for logs machines; if GuanceDB for logs is deployed using containers, corresponding resources also need to be released using helm uninstall)
  • S3 buckets
  • IAM users AK/SK
  • Internal DNS resolution

FAQ

Migrating from Opensearch to ScopeDB

If you want to use ScopeDB for an initial deployment, you can first deploy OpenSearch and then switch to ScopeDB.

Initial Setup

After deployment completion, insert a ScopeDB instance record into the MySQL database used by the Guance platform:

sql
INSERT INTO df_core.main_es_instance
(id, uuid, dbType, name, host, `authorization`, configJSON, isParticipateElection, wsCount, provider, version, timeout, extend, versionType, isCustom, status, creator, updator, createAt, deleteAt, updateAt)
VALUES(100, 'scopdedb_12dsad3', 'scopedb', 'scopedb', '', '{}', '{"read": {"host": "http://guance-select.guancedb:8481"}, "write": {"host": "http://guance-insert.guancedb:8480"}, "version": "v1"}', 0, 0, '', '', '60s', NULL, 'free', 0, 0, 'sys', 'sys', -1, -1, -1);

Update Workspace Configuration

sql
# Find workspace ID and update database
UPDATE main_workspace SET datastore = CAST(REPLACE(CAST(datastore AS CHAR), 'es', 'scopedb') AS JSON)
WHERE uuid = 'wksp_305c27ea445143xxxxd5adc01e';

UPDATE main_workspace SET esInstanceUUID = "scopdedb_12dsad3"
WHERE uuid = 'wksp_305c27ea44514397a1b5415fd5adc01e';

Clean Up Redis

bash
# Connect to redis, find keys and delete
redis-cli -h redis_host -p 6379 -a "password" KEYS "wkspInfo:*" | xargs redis-cli -h redis_host -p 6379 -a "password" DEL
redis-cli -h redis_host -p 6379 -a "password" DEL "doris_store_info"

Disable Elasticsearch Instance

sql
# Find ES instance uuid
SELECT * FROM main_es_instance LIMIT 10;

# Set ES instance status to disabled
UPDATE main_es_instance SET status = 3 WHERE uuid = 'xxx';

Restart Services

Restart all services in forethought-kodo namespace:

  • kodo
  • kodo-asynq-client
  • kodo-inner
  • kodo-nginx
  • kodo-servicemap
  • kodo-ws
  • kodo-x
  • kodo-x-backuplog
  • kodo-x-scan