How to copy items from one DynamoDB table to another using Python on AWS

You can use Python to copy items from one DynamoDB table to another. The same script also works between DynamoDB tables in different AWS accounts. You do not need to write any code yourself; running the script is enough to complete the copy. A basic understanding of Python is needed only if you want to follow what the script does.

You can run this script from any machine that has internet access and has Python and Boto3 installed. The script was tested with Python 2.7.16; you can also try other Python 2.7 releases.

The AWS Data Pipeline service can also copy items from one DynamoDB table to another, but that process is somewhat tedious, so I wrote this script to simplify the task.

Now, let's get started.

Prerequisites

  1. Basic understanding of Python.
  2. Python 2.7.16 and Boto3 installed on the Linux Server.
  3. AWS Account (Create if you don’t have one).
  4. 'access_key' & 'secret_key' of an AWS IAM User with sufficient/full permissions on DynamoDB.

What we will do

  1. Check Prerequisites.
  2. Create a Script.
  3. Execute the Script.

Check Prerequisites

Check Python

python --version

Check Pip

pip --version

Check Boto3

pip show boto3
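The same checks can also be done from inside Python. The following is a minimal sketch, not part of the original script; the function name check_prerequisites is my own choice for illustration. It reports the interpreter version and, if Boto3 can be imported, its version.

```python
import sys


def check_prerequisites():
    """Return (python_version, boto3_version); boto3_version is None if Boto3 is missing."""
    python_version = "%d.%d.%d" % tuple(sys.version_info[:3])
    try:
        import boto3  # imported here so the check itself works without Boto3
        boto3_version = boto3.__version__
    except ImportError:
        boto3_version = None
    return python_version, boto3_version


if __name__ == "__main__":
    version, boto3_version = check_prerequisites()
    print("Python: " + version)
    print("Boto3: " + (boto3_version or "not installed"))
```

If Boto3 is reported as not installed, `pip install boto3` should fix it.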

Create a Script

Create a new file with the following code on your local system. The code is also available in my GitHub repository at the link below.

GitHub Link: https://github.com/shivalkarrahul/DevOps/blob/master/aws/python/aws-copy-dynamo-db-table/copy-dynamodb-table.py

File: copy-dynamodb-table.py

import boto3
import sys
import argparse
import datetime


parser = argparse.ArgumentParser()

parser.add_argument('-sa', '--source_aws_access_key_id', required=True, action="store", dest="source_aws_access_key_id",
                    help="Source AWS Account aws_access_key_id", default=None)
parser.add_argument('-ss', '--source_aws_secret_access_key', required=True, action="store", dest="source_aws_secret_access_key",
                    help="Source AWS Account aws_secret_access_key", default=None)
parser.add_argument('-da', '--destination_aws_access_key_id', required=True, action="store", dest="destination_aws_access_key_id",
                    help="Destination AWS Account aws_access_key_id", default=None)
parser.add_argument('-ds', '--destination_aws_secret_access_key', required=True, action="store", dest="destination_aws_secret_access_key",
                    help="Destination AWS Account aws_secret_access_key", default=None)
parser.add_argument('-st', '--sourceTableName', required=True, action="store", dest="sourceTableName",
                    help="Source AWS Account DynamoDB Table", default=None)
parser.add_argument('-dt', '--destinationTableName', required=True, action="store", dest="destinationTableName",
                    help="Destination AWS Account DynamoDB Table", default=None)
args = parser.parse_args()

source_aws_access_key_id = args.source_aws_access_key_id
source_aws_secret_access_key = args.source_aws_secret_access_key

destination_aws_access_key_id = args.destination_aws_access_key_id
destination_aws_secret_access_key = args.destination_aws_secret_access_key


sourceTableName=args.sourceTableName 
destinationTableName=args.destinationTableName 

sourceTableExists = "false" 
destinationTableExists = "false" 

print("Printing values")
# The access keys and secret keys are deliberately not printed,
# so that credentials do not end up in terminal output or logs.
print("sourceTableName", sourceTableName)
print("destinationTableName", destinationTableName)


timeStamp = datetime.datetime.now()
backupName = destinationTableName + str(timeStamp.strftime("-%Y_%m_%d_%H_%M_%S"))

item_count = 1000  # Maximum number of items to copy; change this to copy more or fewer items
counter = 1  # Do not change this

source_session = boto3.Session(region_name='eu-west-3', aws_access_key_id=source_aws_access_key_id, aws_secret_access_key=source_aws_secret_access_key)
source_dynamo_client = source_session.client('dynamodb')

target_session = boto3.Session(region_name='eu-west-3', aws_access_key_id=destination_aws_access_key_id, aws_secret_access_key=destination_aws_secret_access_key)
target_dynamodb = target_session.resource('dynamodb')


dynamoclient = boto3.client('dynamodb', region_name='eu-west-3',  # Specify the source region here
    aws_access_key_id=source_aws_access_key_id,  # Source account's access key (from -sa)
    aws_secret_access_key=source_aws_secret_access_key)  # Source account's secret key (from -ss)

dynamotargetclient = boto3.client('dynamodb', region_name='eu-west-3',  # Specify the destination region here
    aws_access_key_id=destination_aws_access_key_id,  # Destination account's access key (from -da)
    aws_secret_access_key=destination_aws_secret_access_key)  # Destination account's secret key (from -ds)
# response = dynamotargetclient.list_tables()
# print("List of tables", response)

dynamopaginator = dynamoclient.get_paginator('scan')

def validateTables(sourceTable, destinationTable):
    print("Inside validateTables")
    try:
        dynamoclient.describe_table(TableName=sourceTable)
        sourceTableExists = "true"
    except dynamoclient.exceptions.ResourceNotFoundException:
        sourceTableExists = "false"


    try:
        dynamotargetclient.describe_table(TableName=destinationTable)
        destinationTableExists = "true"
    except dynamotargetclient.exceptions.ResourceNotFoundException:
        destinationTableExists = "false"
    
    return {'sourceTableExists': sourceTableExists, 'destinationTableExists':destinationTableExists}        



def copyTable(sourceTable, destinationTable,item_count,counter):
    
    print("Inside copyTable")
    print("Copying", sourceTable, "to", destinationTable)

    print('Start Reading the Source Table')
    # paginate() is lazy: it returns a page iterator and only calls Scan while
    # iterating, so a missing table surfaces inside the loop below, not here.
    dynamoresponse = dynamopaginator.paginate(
        TableName=sourceTable,
        Select='ALL_ATTRIBUTES',
        ReturnConsumedCapacity='NONE',
        ConsistentRead=True
    )

    print('Proceed with writing to the Destination Table')
    print("Writing first", item_count, "items")
    try:
        for page in dynamoresponse:
            for item in page['Items']:
                # Stop once item_count items have been written.
                if counter > item_count:
                    print("exiting")
                    sys.exit()
                print('writing item no', counter)
                dynamotargetclient.put_item(
                    TableName=destinationTable,
                    Item=item
                )
                counter = counter + 1
    except dynamoclient.exceptions.ResourceNotFoundException:
        print("Table does not exist")
        print("Exiting")
        sys.exit()

def backupTable(destTableName, backupTimeStamp):
    print("Inside backupTable")
    print("Taking backup of = ", destTableName)
    print("Backup Name = ", backupTimeStamp)

    response = dynamotargetclient.create_backup(
        TableName=destTableName,
        BackupName=backupTimeStamp
    )
    print("Backup ARN =", response["BackupDetails"]["BackupArn"])

def deleteDestinationTable(destTableName):
    print("Inside deleteDestinationTable")
    try:
        dynamotargetclient.delete_table(TableName=destTableName)
        waiter = dynamotargetclient.get_waiter('table_not_exists')
        waiter.wait(TableName=destTableName)
        print("Table deleted")
    except dynamotargetclient.exceptions.ResourceNotFoundException:
        print("Table does not exist")


def doesNotExist():
    print("Inside doesNotExist")
    print("Destination table does not exist ")
    print("Exiting the execution")
    # sys.exit()

def createDestinationTable(sourceTable):
    print("Inside createDestinationTable")
    source_table = source_session.resource('dynamodb').Table(sourceTable)

    # Only the key schema and attribute definitions are taken from the source table;
    # secondary indexes are not recreated and throughput is fixed at 5 read / 5 write units.
    target_table = target_dynamodb.create_table(
        TableName=destinationTableName,
        KeySchema=source_table.key_schema,
        AttributeDefinitions=source_table.attribute_definitions,
        ProvisionedThroughput={
            'ReadCapacityUnits': 5,
            'WriteCapacityUnits': 5
        })

    target_table.wait_until_exists()
    target_table.reload()


result = validateTables(sourceTableName, destinationTableName)
print("value of sourceTableExists = ", result['sourceTableExists'])
print("value of destinationTableExists = ", result['destinationTableExists'])

if (result['sourceTableExists'] == "false" ) and (result['destinationTableExists'] == "false" ):
    print("Both the tables do not exist")

elif (result['sourceTableExists'] == "false" ) and (result['destinationTableExists'] == "true" ):
    print("Source Table does not exist")

elif (result['sourceTableExists'] == "true" ) and (result['destinationTableExists'] == "false" ):
    createDestinationTable(sourceTableName)
    copyTable(sourceTableName, destinationTableName, item_count, counter)

elif (result['sourceTableExists'] == "true" ) and (result['destinationTableExists'] == "true" ):
    backupTable(destinationTableName, backupName)
    deleteDestinationTable(destinationTableName)

    createDestinationTable(sourceTableName)
    copyTable(sourceTableName, destinationTableName, item_count, counter)

else:
    print("Something is wrong")

Syntax: 

python copy-dynamodb-table.py -sa <source-account-access-key-here> -ss <source-account-secret-key-here> -da <destination-account-access-key-here> -ds <destination-account-secret-key-here> -st <source-table-name-here> -dt <destination-table-name-here>
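To see how these flags turn into values, here is a trimmed-down sketch of the argument parsing. It uses the same option names as the script, but the helper build_parser and the placeholder values in example_argv are only for illustration.

```python
import argparse


def build_parser():
    """Build a parser with the same six required options as the script."""
    parser = argparse.ArgumentParser(description="Copy items between DynamoDB tables")
    parser.add_argument("-sa", "--source_aws_access_key_id", required=True)
    parser.add_argument("-ss", "--source_aws_secret_access_key", required=True)
    parser.add_argument("-da", "--destination_aws_access_key_id", required=True)
    parser.add_argument("-ds", "--destination_aws_secret_access_key", required=True)
    parser.add_argument("-st", "--sourceTableName", required=True)
    parser.add_argument("-dt", "--destinationTableName", required=True)
    return parser


# Placeholder values; a real run would take these from the command line.
example_argv = [
    "-sa", "SOURCE_ACCESS_KEY", "-ss", "SOURCE_SECRET_KEY",
    "-da", "DEST_ACCESS_KEY", "-ds", "DEST_SECRET_KEY",
    "-st", "my-source-table", "-dt", "my-destination-table",
]
args = build_parser().parse_args(example_argv)
print(args.sourceTableName + " -> " + args.destinationTableName)
```

Because every option is marked required, leaving one out makes argparse print a usage message and exit before any AWS call is made.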

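Execute the Script

Pass the arguments to the script using the syntax above. Note that the region is hard-coded in the script as eu-west-3 and that the item_count variable (set to 1000) caps how many items are copied; edit these values in the script if your tables are in another region or hold more items.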

Command:

python copy-dynamodb-table.py -sa AKI12345IA5XJXFLMTQR -ss ihiHd8+NzLJ567890z4i6EwcN6hbV2A5cMfurscg -da AKI12345IA5XJXFLMTQR -ds  ihiHd8+NzLJ567890z4i6EwcN6hbV2A5cMfurscg -st my-source-table -dt my-destination-table

Here,

  • -sa = Source AWS Account Access key = AKI12345IA5XJXFLMTQR
  • -ss = Source AWS Account Secret key = ihiHd8+NzLJ567890z4i6EwcN6hbV2A5cMfurscg
  • -da = Destination AWS Account Access key = AKI12345IA5XJXFLMTQR
  • -ds = Destination AWS Account Secret key = ihiHd8+NzLJ567890z4i6EwcN6hbV2A5cMfurscg
  • -st = Source Table = my-source-table
  • -dt = Destination Table = my-destination-table

You must use your own keys; the keys shown here belong to me.
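Since the keys are passed on the command line, they can easily end up in shell history or in logs. If a key ever needs to be displayed, a small helper like the one below could mask most of it. This is a hypothetical helper of my own (mask_secret is not part of the script), shown only as a sketch.

```python
def mask_secret(value, visible=4):
    """Replace all but the last `visible` characters of `value` with '*'."""
    if len(value) <= visible:
        # Too short to reveal anything safely, so mask everything.
        return "*" * len(value)
    hidden = len(value) - visible
    return "*" * hidden + value[hidden:]


print(mask_secret("EXAMPLEKEY123456"))  # ************3456
```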

The script covers 4 different use-cases

  1. Use-case 1: Both the tables, Source and Destination, do not exist.
  2. Use-case 2: Source table does not exist but Destination table exists.
  3. Use-case 3: Source table exists but Destination table does not exist.
  4. Use-case 4: Both the tables, Source and Destination, exist.
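The four use-cases boil down to a decision on two yes/no facts. The sketch below restates that branching as a small standalone function; decide_action and the returned strings are my own illustrative names, not something the script defines.

```python
def decide_action(source_exists, destination_exists):
    """Map the existence of the two tables to what the script does next."""
    if not source_exists and not destination_exists:
        return "exit: both the tables do not exist"
    if not source_exists:
        return "exit: source table does not exist"
    if not destination_exists:
        return "create destination table, then copy"
    return "back up destination, delete it, recreate it, then copy"


for source in (False, True):
    for destination in (False, True):
        print(source, destination, "->", decide_action(source, destination))
```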

Let's see these use-cases one by one.

Use-case 1: Both the tables, Source and Destination, do not exist.

If neither the source nor the destination table exists when you run the script, it exits with the "Both the tables do not exist" message.

Use-case 2: Source table does not exist but Destination table exists.

If you pass a source table that does not exist, the script exits with the "Source Table does not exist" message.
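The existence check behind these messages relies on describe_table raising ResourceNotFoundException for a missing table. Here is a minimal sketch of that check as a reusable function. FakeClient is only a stand-in I wrote so the example runs without AWS access; with Boto3 you would pass a real DynamoDB client instead.

```python
def table_exists(client, table_name):
    """Return True if describe_table succeeds, False if the table is missing."""
    try:
        client.describe_table(TableName=table_name)
        return True
    except client.exceptions.ResourceNotFoundException:
        return False


class FakeClient(object):
    """Stand-in for a Boto3 DynamoDB client (assumption, for illustration only)."""

    class exceptions(object):
        class ResourceNotFoundException(Exception):
            pass

    def __init__(self, tables):
        self.tables = set(tables)

    def describe_table(self, TableName):
        if TableName not in self.tables:
            raise self.exceptions.ResourceNotFoundException(TableName)
        return {"Table": {"TableName": TableName}}


fake = FakeClient(["my-destination-table"])
print(table_exists(fake, "my-source-table"))       # False
print(table_exists(fake, "my-destination-table"))  # True
```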

Use-case 3: Source table exists but Destination table does not exist.

In the two use-cases above, no operation is performed. If the source table exists but the destination table does not, the script creates a table with the name you specify as the destination and copies items from the source table into the newly created table.
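The copy step itself is a loop over pages of scanned items with an upper limit on how many are written. The sketch below shows that idea with plain Python objects so it can run anywhere. copy_items, its parameters, and the sample pages are assumptions for illustration; with Boto3, pages would come from the scan paginator and put_item would wrap the destination client's put_item call.

```python
def copy_items(pages, put_item, limit):
    """Write at most `limit` items from `pages` using `put_item`; return the count written."""
    written = 0
    for page in pages:
        for item in page["Items"]:
            if written >= limit:
                return written
            put_item(item)
            written += 1
    return written


# Sample data shaped like DynamoDB scan pages (made up for this example).
pages = [
    {"Items": [{"id": {"S": "1"}}, {"id": {"S": "2"}}]},
    {"Items": [{"id": {"S": "3"}}]},
]
destination = []
print(copy_items(pages, destination.append, limit=2))  # 2
```

Returning the count instead of calling sys.exit() keeps the function easy to test and leaves the decision to stop the program to the caller.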

Use-case 4: Both the tables, Source and Destination, exist.

In this scenario, the script first takes a backup of the destination table and then deletes the table. After the deletion, it creates a new table with the name you specified as the destination and copies the items from the source table into it.
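The backup gets a name made of the destination table name plus a timestamp, which keeps backups from repeated runs apart. A minimal sketch of that naming, using the same strftime pattern as the script (build_backup_name is a name I chose for illustration):

```python
import datetime


def build_backup_name(table_name, moment):
    """Append a -YYYY_MM_DD_HH_MM_SS timestamp to the table name."""
    return table_name + moment.strftime("-%Y_%m_%d_%H_%M_%S")


example_time = datetime.datetime(2020, 1, 31, 9, 5, 7)
print(build_backup_name("my-destination-table", example_time))
# my-destination-table-2020_01_31_09_05_07
```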

Conclusion

In this article, we walked through a Python script that copies items from one DynamoDB table to another. The script covers four different use-cases that can arise while copying items between tables. You can now use it to copy items from one DynamoDB table to another in the same or a different AWS account.
