This is the procedure for replacing a broken disk in a ZFS RAID array. In this example we simulate replacing a disk in a ZFS RAID-1 (mirror) pool.

  • Check the ZFS pool status
    #zpool status -v
  • Identify the failed disk and physically replace it (let’s say the failed disk is /dev/sdb)
  • Clone the partition table from the healthy disk to the replacement (the new disk comes first, the existing disk second)
    #sgdisk -R /dev/sdb /dev/sda
  • Randomize the GUIDs on the new disk so they do not clash with the cloned source
    #sgdisk -G /dev/sdb
  • Replace the disk in the pool; since the new disk reuses the same device name, a single argument is enough
    #zpool replace rpool /dev/sdb2
  • Monitor the resilvering process
    #zpool status -v
  • If the replaced disk held the boot partition, make sure to update the GRUB configuration so the bootloader is reinstalled on it (tick the new disk in the device selection)
    #dpkg-reconfigure grub-pc
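The steps above can be condensed into a short script. This is a sketch under assumptions, not a drop-in tool: it assumes the failed disk is /dev/sdb, the healthy mirror member is /dev/sda, the pool is named rpool, and ZFS lives on partition 2. It ships in dry-run mode so the commands are only printed; clear DRY_RUN to execute them for real.

```shell
#!/bin/sh
# Dry-run sketch of the replacement sequence. Assumed layout: failed disk
# /dev/sdb, healthy disk /dev/sda, pool rpool, ZFS on partition 2.
DRY_RUN=1                           # unset this to actually run the commands
run() { ${DRY_RUN:+echo} "$@"; }    # prints the command while DRY_RUN is set

run sgdisk -R /dev/sdb /dev/sda     # clone partition table: new disk first
run sgdisk -G /dev/sdb              # randomize GUIDs on the new disk
run zpool replace rpool /dev/sdb2   # start resilvering onto the new partition
run zpool status -v rpool           # check resilver progress
```

Run it once as-is to review the exact commands before committing to them on real hardware.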

Thank You

Automated Cloud SQL export to Cloud Storage (Updated)

One of my customers has a production environment in GCP. We helped them migrate from on-premises to GCP and set up automated backups for their Cloud SQL for MySQL 5.7 instances. Since the built-in Cloud SQL backup uses instance snapshots and only stores/rotates 7 days of recovery points, they needed another method to keep MySQL backups with a longer retention.

We then used the components below to create an automated SQL-file export from the Cloud SQL instances and store it in Cloud Storage.

  • Cloud Scheduler
    The scheduler that runs the task
  • Cloud Pub/Sub
    The payload source that triggers the automated export process
  • Cloud Functions
    The function that runs the export process using the SQL Admin API
  • Cloud Storage
    The storage target for the exported files from the automated process
  • Cloud IAM
    Permission and service account management for the related processes/tools
  • Cloud SQL Admin API
    Make sure this is enabled, so the Cloud Function is allowed to run the export

How to

  • Enable Cloud SQL Admin API
  • Create a Cloud Storage bucket. In this case, we use the Nearline storage class since the purpose is to store backups. You can also set a lifecycle for the objects in the bucket, e.g. automatically archive or delete all objects older than 3 months.
  • Get the Cloud SQL instance’s service account and grant it write access to the bucket (a Storage object/bucket writer role).
  • Create a new IAM service account with a role that allows calling the Cloud SQL Admin API
  • Create a new Pub/Sub topic, e.g. named “Backup-Payload”
  • Create a Cloud Function
    • Name: The function name
    • Region: The geographic location where you want the Cloud Function to run
    • Memory: The amount of memory you want to allocate for the Cloud Function. I chose the smallest one
    • Trigger: Choose the method to trigger the Cloud Function, in this case the newly created Pub/Sub topic
    • Runtime: Choose the runtime; in this example the function targets Node.js 10
    • Source Code: Use the inline editor and paste the function from the code below
    • Function to Execute: Input the function name from the code
    • Service Account: The service account that was created before
  • Create Cloud Scheduler
    • Name: The schedule name
    • Frequency: Uses the cron format; in this case we set the backup to run at 01:00 in the morning
    • Target: We chose Pub/Sub
    • Topic: Choose the Pub/Sub topic that we created before
    • Payload: The JSON content describing the project, the database instance to export, and the storage target. Find the payload at the end.
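The bucket, permissions, and topic from the steps above can also be provisioned from the command line. This is a sketch under assumptions: the bucket name, location, 90-day delete rule, and service-account placeholder are examples, not values from the original setup, and it requires an authenticated gcloud/gsutil.

```shell
# Enable the Cloud SQL Admin API for the project.
gcloud services enable sqladmin.googleapis.com

# Nearline bucket for the exports, with a delete-after-90-days lifecycle rule.
gsutil mb -c nearline -l asia-southeast1 gs://bucket-names
cat > lifecycle.json <<'EOF'
{"rule": [{"action": {"type": "Delete"}, "condition": {"age": 90}}]}
EOF
gsutil lifecycle set lifecycle.json gs://bucket-names

# Allow the Cloud SQL instance's service account (shown by
# `gcloud sql instances describe DB_INSTANCE_NAME`) to write objects.
# SQL_SA_EMAIL is a placeholder for that account's e-mail.
gsutil iam ch serviceAccount:SQL_SA_EMAIL:roles/storage.objectCreator gs://bucket-names

# Pub/Sub topic that will trigger the function.
gcloud pubsub topics create Backup-Payload
```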

Payload for Scheduler

{"project": "PROJECT_ID", "database": "DB_INSTANCE_NAME", "bucket": "gs://bucket-names"}
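A matching scheduler job can be created from the command line as well; the job name here is an example, and `0 1 * * *` is the cron expression for a daily run at 01:00.

```shell
# Daily 01:00 job that publishes the export payload to the topic.
gcloud scheduler jobs create pubsub cloudsql-backup \
    --schedule="0 1 * * *" \
    --topic=Backup-Payload \
    --message-body='{"project": "PROJECT_ID", "database": "DB_INSTANCE_NAME", "bucket": "gs://bucket-names"}'
```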

The Function (Updated)

The export file name is stamped with the current date in DD-MM-YYYY format.

const { google } = require('googleapis')
const { auth } = require('google-auth-library')
const sqladmin = google.sqladmin('v1beta4')

/**
 * Triggered from a Pub/Sub topic.
 * 
 * The input must be as follows:
 * {
 *   "project": "PROJECT_ID",
 *   "database": "DATABASE_NAME",
 *   "bucket": "BUCKET_NAME_WITH_OPTIONAL_PATH_WITHOUT_TRAILING_SLASH"
 * }
 *
 * @param {!Object} event Event payload
 * @param {!Object} context Metadata for the event
 */

exports.initiateBackup = async (event, context) => {
        // Build the DD-MM-YYYY stamp inside the handler so warm function
        // instances do not keep reusing the date from their first invocation.
        const today = new Date()
        const dd = String(today.getDate()).padStart(2, '0')
        const mm = String(today.getMonth() + 1).padStart(2, '0')
        const yyyy = today.getFullYear()

        const pubsubMessage = JSON.parse(Buffer.from(event.data, 'base64').toString())
        const authRes = await auth.getApplicationDefault()
        const request = {
                auth: authRes.credential,
                project: pubsubMessage['project'],
                instance: pubsubMessage['database'],
                resource: {
                        exportContext: {
                                kind: 'sql#exportContext',
                                fileType: 'SQL',
                                uri: pubsubMessage['bucket'] + '/backup-' + dd + '-' + mm + '-' + yyyy + '.gz'
                        }
                }
        }
        sqladmin.instances.export(request, (err, res) => {
                if (err) console.error(err)
                if (res) console.info(res)
        })
}

package.json
{
        "name": "cloudsql-backups",
        "version": "1.0.0",
        "dependencies": {
                "googleapis": "^45.0.0",
                "google-auth-library": "3.1.2"
        }
}
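If you prefer the command line over the inline editor, the same function can be deployed with gcloud. A sketch, assuming index.js and package.json sit in the current directory; SA_EMAIL is a placeholder for the service account created earlier.

```shell
# Deploy the Pub/Sub-triggered export function (smallest memory tier).
gcloud functions deploy initiateBackup \
    --runtime=nodejs10 \
    --trigger-topic=Backup-Payload \
    --memory=128MB \
    --service-account=SA_EMAIL
```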

Reference
https://revolgy.com/blog/how-to-automated-long-term-cloud-sql-backups-step-by-step-guide/

Mariadb 10.4.X Memory Leak Issue

Recently, I helped a customer migrate their services from a single VM to one VM per service. I managed to keep downtime minimal while migrating the web data and database data to each VM. All new VMs use the latest OS and the latest stack versions from the repositories.

All the new VMs and their dedicated services worked as expected. Then we noticed an increase of throughput on the front end. Further investigation showed a bottleneck in the database VM, where we found “Out of Memory” errors. From the logs and monitoring statistics, I found that memory usage increased dramatically whenever MySQL was under heavy access, but the memory was not released until the mysqld service was OOM-killed. The old environment had no such issue with the same configuration; by the way, the old database was MariaDB 10.2 and the new one is MariaDB 10.4.12.

I tried tuning the configuration to reduce the memory footprint, lowering the connection limits, and tuning the system kernel, but the same issue kept happening.

After searching the web for this unexpected behavior, I found some people with the same issue, but no specific solution.
There is an old MariaDB 10.2–10.3 bug report about a memory leak, still with “Unresolved” status. Here’s the link.

Further reading suggested that this condition may be caused by the memory allocator. Using that keyword, I managed to find someone with the same issue who resolved it by changing the memory allocation library of their MariaDB server. Please use the link to find the related information here.

Checking the server variables, I found that MariaDB 10.4 was still using the system default memory allocator.

MariaDB [(none)]> SHOW VARIABLES LIKE 'version_malloc_library';
+------------------------+--------+
| Variable_name          | Value  |
+------------------------+--------+
| version_malloc_library | system |
+------------------------+--------+
1 row in set (0.00 sec)

Then I installed the jemalloc library on the system so it can be used as the memory allocation library for MariaDB 10.4.

apt-get update
apt-get install -y libjemalloc-dev

Then update the systemd environment configuration so MariaDB preloads the jemalloc library.

cd /etc/systemd/system/mariadb.service.d/
Edit any .conf file in this directory and add the line below at the end, under its [Service] section:
Environment="LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.1"
Then reload systemd and restart MariaDB:
systemctl daemon-reload
systemctl stop mariadb && systemctl start mariadb

Validate the new memory allocation library.

MariaDB [(none)]> SHOW VARIABLES LIKE 'version_malloc_library';
+------------------------+-------------------+
| Variable_name          | Value             |
+------------------------+-------------------+
| version_malloc_library | jemalloc 3.6.0-11 |
+------------------------+-------------------+
1 row in set (0.001 sec)

Thank You.