TiDB-Operator: Backing Up to MinIO

2024/5/19 16:27:37 Tags: tidb, minio, database, backup, cloud native

Create the MinIO S3 storage

  1. Initialize MinIO
minio server $HOME/operator/data --console-address :9090
  1. Set the MinIO region to shanghai

Create the tidb-operator Backup CR

  1. Contents of the Backup CR configuration file backup-s3.yaml

apiVersion: pingcap.com/v1alpha1
kind: Backup
metadata:
  name: backup2s3-dev
  namespace: tidb-admin
  labels:
    user: paul
spec:
  ## Describes the compute resource requirements and limits of Backup.
  ## Ref: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/
  resources:
    requests:
      cpu: 500m
      memory: 512Mi
    limits:
      cpu: 500m
      memory: 512Mi

  ## List of environment variables to set in the container, like v1.Container.Env.
  ## Note that the following builtin env vars will be overwritten by values set here.
  ## - S3_PROVIDER
  ## - S3_ENDPOINT
  ## - AWS_REGION
  ## - AWS_ACL
  ## - AWS_STORAGE_CLASS
  ## - AWS_DEFAULT_REGION
  ## - AWS_ACCESS_KEY_ID
  ## - AWS_SECRET_ACCESS_KEY
  ## - GCS_PROJECT_ID
  ## - GCS_OBJECT_ACL
  ## From is the TidbCluster to be backed up.
  ## It takes higher precedence than the spec in BR. If `from` is not set, the cluster in BR will be backed up.
  #from:
    ## Host is the address of the TidbCluster to be backed up, which is the service name of the TidbCluster, such as `basic-tidb`.
    #host: alpha-tidb
    ## Port is the port of the TidbCluster to be backed up.
    #port: 4000
    ## User is the accessing user of the TidbCluster to be backed up.
    #user: root
    ## SecretName is the secret that contains the password of the accessing user of the TidbCluster to be backed up.
    # secretName: sh.helm.release.v1.tidb-operator.v1 
    ## TLSClientSecretName is the name of secret which stores tidb server client certificate.
    ## Defaults to nil.
    # tlsClientSecretName: ""
  backupType: full
  backupMode: snapshot
  ## TikvGCLifeTime specifies the safe gc life time for Backup.
  ## The time limit during which data is retained for each GC, in the format of Go Duration.
  ## When a GC happens, the current time minus this value is the safe point.
  ## Defaults to 72h.
  tikvGCLifeTime: 72h

  s3:
    provider: aws
    secretName: minio-secret
    bucket: tidbuss
    prefix: tidb/s3
    endpoint: http://192.168.1.2:9000

  ## StorageSize is the PV size specified for the backup operation.
  ## This value must be greater than the size of the TidbCluster to be backed up.
  ## Defaults to 100Gi.
  storageSize: "100Gi"


  ## BR configuration.
  ## Ref: https://docs.pingcap.com/tidb/stable/backup-and-restore-tool
  br:
    ## Cluster specifies name of TidbCluster to be backed up.
    cluster: "alpha"
    ## Namespace specifies namespace of TidbCluster to be backed up.
    clusterNamespace: "tidb-admin"
    ## LogLevel is the log level. Defaults to `info`.
    # logLevel: "info"
    ## StatusAddr is the HTTP listening address for the status report service. Defaults to empty.
    # statusAddr: ""
    ## Concurrency is the size of thread pool on each node that execute the backup task.
    ## Defaults to 4.
    concurrency: 4
    ## RateLimit is the rate limit of the backup task, MB/s per node.
    ## If set to 4, the speed limit is 4 MB/s. The speed limit is not set by default.
    # rateLimit: 0
    ## TimeAgo presents back up the data before `timeAgo`, e.g. 1m, 1h. Defaults to empty.
    # timeAgo: 1m
    ## Checksum specifies whether to verify the files after the backup is completed.
    ## Defaults to `true`.
    # checksum: true
    ## CheckRequirements specifies whether to check requirements before backup
    # checkRequirements: true
    ## SendCredToTikv specifies whether the BR process passes its AWS or GCP privileges to the TiKV process.
    ## Defaults to `true`.
    sendCredToTikv: true
    ## OnLine specifies whether online during restore. Defaults to false.
    # onLine: false
    ## Options specifies the extra arguments that BR supports. These options have the highest priority.
    # options: []

  

  ## ToolImage specifies the tool image used in `Backup`, which supports BR and Dumpling images.
  ## For examples `spec.toolImage: pingcap/br:v5.2.0` or `spec.toolImage: pingcap/dumpling:v5.2.0`
  ## For BR image, if it does not contain tag, Pod will use image 'ToolImage:${TiKV_Version}'.
  toolImage: pingcap/br:v6.5.5

  ## ImagePullSecrets is an optional list of references to secrets in the same namespace to use for pulling any of the images.
  ## If private registry is used, imagePullSecrets may be set.
  ## You can also set this in service account.
  ## Ref: https://kubernetes.io/docs/concepts/containers/images/#specifying-imagepullsecrets-on-a-pod
  # imagePullSecrets:
  # - name: secretName

  ## TableFilter specifies tables that match the table filter rules for BR or Dumpling.
  ## Ref: https://docs.pingcap.com/tidb/stable/table-filter
  ## Defaults to empty.
  # tableFilter: []

  ## Affinity for Backup pod scheduling
  ## Ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity
  # affinity: {}

  ## UseKMS to decrypt the secrets. Defaults to false.
  useKMS: false

  ## ServiceAccount Specify service account of Backup.
  serviceAccount: "tidb-backup-manager"

  ## CleanPolicy specifies whether to clean backup data when the Backup CR is deleted, if not set, the backup data will be retained.
  ## `Retain` represents that the backup data will be retained when the Backup CR is deleted.
  ## `OnFailure` represents that the backup data will be cleaned only for the failed backups when the Backup CR is deleted.
  ## `Delete` represents that the backup data will be cleaned when the Backup CR is deleted.
  cleanPolicy: Retain
  1. Apply the manifest to create the backup
kubectl -n tidb-admin apply -f  backup-s3.yaml

Troubleshooting backup errors

  1. The error log is as follows:
E1105 10:41:10.821663       8 manager.go:408] Get backup metadata for backup files in s3://tidbuss/tidb/s3 of cluster tidb-admin/backup2s3-dev failed, err: read backup meta from bucket tidbuss and prefix tidb/s3: blob (key "backupmeta") (code=Unknown): BadRequest: Bad Request
	status code: 400, request id: 1794B3FFD6488588, host id: dd9025bab4ad464b049177c95eb6ebf374d3b3fd1af9251148b658df7ac2e3e8
I1105 10:41:10.841852       8 backup_status_updater.go:128] Backup: [tidb-admin/backup2s3-dev] updated successfully
error: read backup meta from bucket tidbuss and prefix tidb/s3: blob (key "backupmeta") (code=Unknown): BadRequest: Bad Request
	status code: 400, request id: 1794B3FFD6488588, host id: dd9025bab4ad464b049177c95eb6ebf374d3b3fd1af9251148b658df7ac2e3e
  1. Troubleshooting the tidb-operator BR backup failure against MinIO S3
    Analysis of the tidb-operator source code traced the failure to the util.GetBRMetaData method, located in cmd/backup-manager/app/util/util.go.
// Write a unit test case to reproduce the problem
func TestGetBRMetaData(t *testing.T) {
	ctx := context.Background()
	os.Setenv("AWS_ACCESS_KEY_ID", "tidb")
	os.Setenv("AWS_SECRET_ACCESS_KEY", "Jianxin123")
	provider := v1alpha1.StorageProvider{
		S3: &v1alpha1.S3StorageProvider{
			Provider:   "aws",
			Bucket:     "tidbuss",
			Prefix:     "tidb/s3",
			Endpoint:   "http://192.168.1.2:9000",
			SecretName: "minio-secret",
		},
	}
	_, err := GetBRMetaData(ctx, provider)
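	// NOTE: log.Fatalln prints err and then always calls os.Exit(1), even when err is nil,
	// so go test reports FAIL whether or not GetBRMetaData succeeded.
	log.Fatalln(err)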

}

// Original method modified to print debug output that reveals the root cause
// GetBRMetaData get backup metadata from cloud storage
func GetBRMetaData(ctx context.Context, provider v1alpha1.StorageProvider) (*kvbackup.BackupMeta, error) {
	s, err := util.NewStorageBackend(provider, &util.StorageCredential{})
	if err != nil {
		return nil, err
	}
	defer s.Close()

	var metaData []byte
	// use exponential backoff, every retry duration is duration * factor ^ (used_step - 1)
	backoff := wait.Backoff{
		Duration: time.Second,
		Steps:    6,
		Factor:   2.0,
		Cap:      time.Minute,
	}
	fmt.Println("bucket", s.GetBucket())
	//	_, err = s.Attributes(ctx, "backupmeta")
	obj, err := s.List(&blob.ListOptions{Prefix: "tidb/s3"}).Next(ctx)
	fmt.Println("xx bucket", err, obj)
	readBackupMeta := func() error {
		exist, err := s.Exists(ctx, "backupmeta")
		if err != nil {
			return err
		}
		fmt.Println("IS existed", exist)
		if !exist {
			return fmt.Errorf("%s not exist", constants.MetaFile)
		}
		metaData, err = s.ReadAll(ctx, constants.MetaFile)
		if err != nil {
			return err
		}
		return nil
	}
	fmt.Println("xxxx", readBackupMeta())

	isRetry := func(err error) bool {
		return !strings.Contains(err.Error(), "not exist")
	}
	err = retry.OnError(backoff, isRetry, readBackupMeta)
	if err != nil {
		return nil, errors.Annotatef(err, "read backup meta from bucket %s and prefix %s", s.GetBucket(), s.GetPrefix())
	}

	backupMeta := &kvbackup.BackupMeta{}
	err = proto.Unmarshal(metaData, backupMeta)
	if err != nil {
		return nil, errors.Annotatef(err, "unmarshal backup meta from bucket %s and prefix %s", s.GetBucket(), s.GetPrefix())
	}
	return backupMeta, nil
}

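In the modified method, the bucket's Exists and Attributes methods did not return any error information useful for diagnosis, so I switched to the List method to list the backupmeta file stored in MinIO S3. The resulting error log points to the region: the request carries the wrong region, and MinIO expects 'shanghai'.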

go test -timeout 30s -run ^TestGetBRMetaData$
bucket tidbuss
xx bucket blob (code=Unknown): AuthorizationHeaderMalformed: The authorization header is malformed; the region is wrong; expecting 'shanghai'.
        status code: 400, request id: 1794B4674A2D5A48, host id: dd9025bab4ad464b049177c95eb6ebf374d3b3fd1af9251148b658df7ac2e3e8 <nil>
xxxx blob (key "backupmeta") (code=Unknown): BadRequest: Bad Request
        status code: 400, request id: 1794B4674B073F88, host id: dd9025bab4ad464b049177c95eb6ebf374d3b3fd1af9251148b658df7ac2e3e8

The data in the test case is consistent with the s3 settings in backup-s3.yaml, and the result reproduced by the unit test points to the region setting.

  1. Add the region setting to the unit test
func TestGetBRMetaData(t *testing.T) {
	ctx := context.Background()
	os.Setenv("AWS_ACCESS_KEY_ID", "tidb")
	os.Setenv("AWS_SECRET_ACCESS_KEY", "Jianxin123")
	provider := v1alpha1.StorageProvider{
		S3: &v1alpha1.S3StorageProvider{
			Provider:   "aws",
			Region:     "shanghai",
			Bucket:     "tidbuss",
			Prefix:     "tidb/s3",
			Endpoint:   "http://192.168.1.2:9000",
			SecretName: "minio-secret",
		},
	}
	_, err := GetBRMetaData(ctx, provider)
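	// NOTE: log.Fatalln prints err and then always calls os.Exit(1), even when err is nil,
	// so go test reports FAIL whether or not GetBRMetaData succeeded.
	log.Fatalln(err)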

}

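Re-running the test case shows that the call now succeeds: List no longer reports a region error, backupmeta is found, and the returned error is <nil>. (The trailing FAIL and exit status 1 come only from log.Fatalln, which always exits with status 1.) This shows that region in the s3 section of the configuration file is effectively a required field here and must match MinIO's region setting.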

hbu@Pauls-MacBook-Air util % go test -timeout 30s -run ^TestGetBRMetaData$
bucket tidbuss
xx bucket EOF <nil>
IS existed true
xxxx <nil>
IS existed true
2023/11/05 18:52:00 <nil>
exit status 1
FAIL    github.com/pingcap/tidb-operator/cmd/backup-manager/app/util    0.442s
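
The conclusion above, that the s3 region must be set and must match the MinIO region, could be checked before a Backup CR is applied. The following is a minimal sketch under stated assumptions: s3Settings and validateS3Settings are hypothetical names introduced for illustration, the struct mirrors only the s3 fields from backup-s3.yaml, and none of this is part of tidb-operator.

```go
package main

import (
	"errors"
	"fmt"
)

// s3Settings is a hypothetical local struct mirroring the s3 fields used in
// backup-s3.yaml. It is not the tidb-operator v1alpha1 type.
type s3Settings struct {
	Provider string
	Region   string
	Bucket   string
	Prefix   string
	Endpoint string
}

// validateS3Settings is a hypothetical pre-flight check. serverRegion is the
// region configured on the MinIO server; an empty value means none is set.
func validateS3Settings(s s3Settings, serverRegion string) error {
	if s.Bucket == "" {
		return errors.New("s3.bucket must not be empty")
	}
	if serverRegion != "" && s.Region != serverRegion {
		return fmt.Errorf("s3.region %q does not match the MinIO region %q", s.Region, serverRegion)
	}
	return nil
}

func main() {
	cfg := s3Settings{
		Provider: "aws",
		Bucket:   "tidbuss",
		Prefix:   "tidb/s3",
		Endpoint: "http://192.168.1.2:9000",
	}

	// Without a region the check fails, as the original backup did.
	fmt.Println(validateS3Settings(cfg, "shanghai"))

	// With the matching region the check passes.
	cfg.Region = "shanghai"
	fmt.Println(validateS3Settings(cfg, "shanghai"))
}
```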

Fixing the problem

  1. Edit backup-s3.yaml, the tidb-operator configuration file for the Backup CR backup2s3-dev, and add the region field to the s3 section.
apiVersion: pingcap.com/v1alpha1
kind: Backup
metadata:
  name: backup2s3-dev
  namespace: tidb-admin
  labels:
    user: paul
spec:
  ## Describes the compute resource requirements and limits of Backup.
  ## Ref: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/
  resources:
    requests:
      cpu: 500m
      memory: 512Mi
    limits:
      cpu: 500m
      memory: 512Mi

  ## List of environment variables to set in the container, like v1.Container.Env.
  ## Note that the following builtin env vars will be overwritten by values set here.
  ## - S3_PROVIDER
  ## - S3_ENDPOINT
  ## - AWS_REGION
  ## - AWS_ACL
  ## - AWS_STORAGE_CLASS
  ## - AWS_DEFAULT_REGION
  ## - AWS_ACCESS_KEY_ID
  ## - AWS_SECRET_ACCESS_KEY
  ## - GCS_PROJECT_ID
  ## - GCS_OBJECT_ACL
  ## From is the TidbCluster to be backed up.
  ## It takes higher precedence than the spec in BR. If `from` is not set, the cluster in BR will be backed up.
  #from:
    ## Host is the address of the TidbCluster to be backed up, which is the service name of the TidbCluster, such as `basic-tidb`.
    #host: alpha-tidb
    ## Port is the port of the TidbCluster to be backed up.
    #port: 4000
    ## User is the accessing user of the TidbCluster to be backed up.
    #user: root
    ## SecretName is the secret that contains the password of the accessing user of the TidbCluster to be backed up.
    # secretName: sh.helm.release.v1.tidb-operator.v1 
    ## TLSClientSecretName is the name of secret which stores tidb server client certificate.
    ## Defaults to nil.
    # tlsClientSecretName: ""
  backupType: full
  backupMode: snapshot
  ## TikvGCLifeTime specifies the safe gc life time for Backup.
  ## The time limit during which data is retained for each GC, in the format of Go Duration.
  ## When a GC happens, the current time minus this value is the safe point.
  ## Defaults to 72h.
  tikvGCLifeTime: 72h

  s3:
    provider: aws
    secretName: minio-secret
    region: shanghai
    bucket: tidbuss
    prefix: tidb/s3
    endpoint: http://192.168.1.2:9000

  ## StorageSize is the PV size specified for the backup operation.
  ## This value must be greater than the size of the TidbCluster to be backed up.
  ## Defaults to 100Gi.
  storageSize: "100Gi"


  ## BR configuration.
  ## Ref: https://docs.pingcap.com/tidb/stable/backup-and-restore-tool
  br:
    ## Cluster specifies name of TidbCluster to be backed up.
    cluster: "alpha"
    ## Namespace specifies namespace of TidbCluster to be backed up.
    clusterNamespace: "tidb-admin"
    ## LogLevel is the log level. Defaults to `info`.
    # logLevel: "info"
    ## StatusAddr is the HTTP listening address for the status report service. Defaults to empty.
    # statusAddr: ""
    ## Concurrency is the size of thread pool on each node that execute the backup task.
    ## Defaults to 4.
    concurrency: 4
    ## RateLimit is the rate limit of the backup task, MB/s per node.
    ## If set to 4, the speed limit is 4 MB/s. The speed limit is not set by default.
    # rateLimit: 0
    ## TimeAgo presents back up the data before `timeAgo`, e.g. 1m, 1h. Defaults to empty.
    # timeAgo: 1m
    ## Checksum specifies whether to verify the files after the backup is completed.
    ## Defaults to `true`.
    # checksum: true
    ## CheckRequirements specifies whether to check requirements before backup
    # checkRequirements: true
    ## SendCredToTikv specifies whether the BR process passes its AWS or GCP privileges to the TiKV process.
    ## Defaults to `true`.
    sendCredToTikv: true
    ## OnLine specifies whether online during restore. Defaults to false.
    # onLine: false
    ## Options specifies the extra arguments that BR supports. These options have the highest priority.
    # options: []

  

  ## ToolImage specifies the tool image used in `Backup`, which supports BR and Dumpling images.
  ## For examples `spec.toolImage: pingcap/br:v5.2.0` or `spec.toolImage: pingcap/dumpling:v5.2.0`
  ## For BR image, if it does not contain tag, Pod will use image 'ToolImage:${TiKV_Version}'.
  toolImage: pingcap/br:v6.5.5

  ## ImagePullSecrets is an optional list of references to secrets in the same namespace to use for pulling any of the images.
  ## If private registry is used, imagePullSecrets may be set.
  ## You can also set this in service account.
  ## Ref: https://kubernetes.io/docs/concepts/containers/images/#specifying-imagepullsecrets-on-a-pod
  # imagePullSecrets:
  # - name: secretName

  ## TableFilter specifies tables that match the table filter rules for BR or Dumpling.
  ## Ref: https://docs.pingcap.com/tidb/stable/table-filter
  ## Defaults to empty.
  # tableFilter: []

  ## Affinity for Backup pod scheduling
  ## Ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity
  # affinity: {}

  ## UseKMS to decrypt the secrets. Defaults to false.
  useKMS: false

  ## ServiceAccount Specify service account of Backup.
  serviceAccount: "tidb-backup-manager"

  ## CleanPolicy specifies whether to clean backup data when the Backup CR is deleted, if not set, the backup data will be retained.
  ## `Retain` represents that the backup data will be retained when the Backup CR is deleted.
  ## `OnFailure` represents that the backup data will be cleaned only for the failed backups when the Backup CR is deleted.
  ## `Delete` represents that the backup data will be cleaned when the Backup CR is deleted.
  cleanPolicy: Retain
  1. Delete the existing Backup CR
kubectl -n tidb-admin delete -f backup-s3.yaml
  1. In MinIO, delete the files left under the previous backup path

Re-run the backup

hbu@Pauls-MacBook-Air backup % kubectl -n tidb-admin apply -f  backup-s3.yaml
backup.pingcap.com/backup2s3-dev created

Check the result

hbu@Pauls-MacBook-Air data % kubectl  -n tidb-admin get pod                          
NAME                                       READY   STATUS      RESTARTS      AGE
alpha-discovery-68588cd598-k5m56           1/1     Running     1 (12d ago)   15d
alpha-pd-0                                 1/1     Running     1 (12d ago)   15d
alpha-tidb-0                               2/2     Running     2 (12d ago)   15d
alpha-tikv-0                               1/1     Running     1 (12d ago)   15d
backup-backup2s3-dev-l5rq4                 0/1     Completed   0             33s
tidb-controller-manager-54694444b9-ncj8z   1/1     Running     6             15d
hbu@Pauls-MacBook-Air data % kubectl  -n tidb-admin get backup
NAME              TYPE   MODE       STATUS     BACKUPPATH             BACKUPSIZE   COMMITTS             LOGTRUNCATEUNTIL   AGE
backup2s3-dev     full   snapshot   Complete   s3://tidbuss/tidb/s3   271 kB       445430282047455233                      42s

With the modified backup configuration file, tidb-operator successfully backs up to the S3-compatible storage MinIO.

Summary

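If you follow the official TiDB Operator documentation, backing up to the S3-compatible storage MinIO with TiDB Operator is relatively straightforward. However, custom development on top of TiDB Operator requires developers to have a firmer grasp of the relevant fields in order to troubleshoot errors effectively.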

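In addition, AWS S3 and MinIO are, after all, two different products, so how MinIO's region is configured and applied is also something to pay attention to during development.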


http://www.niftyadmin.cn/n/5221663.html
