amazon web services - How to increase the performance of a large number of updates to a Redshift table with Python functions
I have a large Redshift table with around 200 million records. I would like to update the values in one of the columns using a user-defined Python function. If I run the function on an EC2 instance, it results in millions of updates to the table, and it is very slow. Is there a better process for me to speed up these updates?
Unlike row-based systems, which are ideal for transaction processing, column-based systems (such as Redshift) are ideal for data warehousing and analytics, where queries often involve aggregates performed over large data sets. Since only the columns involved in the queries are processed, and columnar data is stored sequentially on the storage media, column-based systems require far fewer I/Os, which greatly improves query performance. The flip side is that many small single-row writes are expensive on this kind of system.
In your example, instead of issuing multiple separate UPDATE commands, you can perform a single UPDATE .. SET .. ... statement.
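A minimal sketch of what that single statement could look like when built from Python, assuming the values computed by the Python function have already been bulk-loaded into a staging table. The table and column names (big_table, staging_updates, id, col) and the helper build_bulk_update are hypothetical and only for illustration; the statement uses the UPDATE ... FROM join form that Redshift supports.

```python
import re

# Hypothetical helper: only plain identifiers are allowed, because the names
# are interpolated directly into the SQL text.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _check_identifier(name):
    """Reject anything that is not a simple SQL identifier."""
    if not _IDENTIFIER.match(name):
        raise ValueError(f"unsafe SQL identifier: {name!r}")
    return name


def build_bulk_update(target, staging, key, column):
    """Build one set-based UPDATE that joins the target table to a staging table.

    Assumption: `staging` already holds one row per key with the new value
    for `column` (for example, the output of the Python function).
    """
    target, staging, key, column = map(
        _check_identifier, (target, staging, key, column)
    )
    return (
        f"UPDATE {target} SET {column} = {staging}.{column} "
        f"FROM {staging} WHERE {target}.{key} = {staging}.{key};"
    )


# Hypothetical names, for illustration only.
sql = build_bulk_update("big_table", "staging_updates", "id", "col")
print(sql)
```

The resulting string can be executed once through whichever database driver you already use, so the cluster performs one join-based update rather than millions of single-row writes.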