How to select data from columns that contain a substring in BigQuery

Recently I've been dealing with requests that ask for data from columns whose names contain a user-defined substring in our BigQuery table. In other words, it's something like this:

    SELECT columns contain this pattern FROM

If we are querying the actual data that matches a pattern, we can use the LIKE operator:

    SELECT * FROM WHERE name LIKE "%Abc" limit 10

However, this does not work for column names. In this post, we will look at two solutions.

So how do we solve it? Since the user-defined pattern can vary, the SQL query itself becomes dynamic in nature. BigQuery provides a way to execute this type of query through scripting. Specifically, we need to construct the columns based on the user input. This fits the scripting use case to “declare a variable, assign a value to it, and then reference it in a third statement”, as stated in the documentation.

We can declare a string variable to represent the columns matching the pattern. Then we need to figure out the value of the variable. The info about the column names is part of the INFORMATION_SCHEMA.COLUMNS table. Therefore, we can get them by:

    SELECT column_name
    FROM. INFORMATION_SCHEMA.COLUMNS
    WHERE table_name = AND REGEXP_CONTAINS(column_name, )

Great! This returns the column names that contain the user-defined pattern. However, we need to convert them into a single string so we can assign it to the columns variable. We can do this by putting the query above into a common table expression (CTE) using WITH and assigning the aggregated value from STRING_AGG to the columns variable using SET:

    SET columns = (
      WITH selected_columns as (
        SELECT column_name
        FROM. INFORMATION_SCHEMA.COLUMNS
        WHERE table_name = AND REGEXP_CONTAINS(column_name, )
      )
      SELECT STRING_AGG(column_name) AS columns FROM selected_columns
    )

Now that we have the dynamically generated columns in the columns variable, we can construct the final query we want and execute it to get the results:

    EXECUTE IMMEDIATE format("SELECT id, %s FROM. limit 10", columns)

In the query above, the %s is replaced by the actual value of the columns variable, which looks like column1,column2,column3,column4,...,columnN. EXECUTE IMMEDIATE is the command we can use to execute dynamic SQL on the fly.

So in our code (assuming we are using Python), we can define a variable called query_string to represent the whole script and execute it using the BigQuery client. Note that the variable has to be declared before SET can assign to it, and the statements in a script are separated by semicolons:

    from google.cloud import bigquery

    bigquery_client = bigquery.Client()

    query_string = """
    DECLARE columns STRING;
    SET columns = (
      WITH selected_columns as (
        SELECT column_name
        FROM. INFORMATION_SCHEMA.COLUMNS
        WHERE table_name = AND REGEXP_CONTAINS(column_name, )
      )
      SELECT STRING_AGG(column_name) AS columns FROM selected_columns
    );
    EXECUTE IMMEDIATE format("SELECT id, %s FROM. limit 10", columns);
    """
    query_job = bigquery_client.query(query_string)
    data = query_job.result()

The desired output is now in the data variable.

We need permission to access the INFORMATION_SCHEMA.COLUMNS table, and the admin may not always want to grant such access. Assuming we don’t have the access, is there anything we can do to extract the column names we need?

One possible solution (untested) is to use two queries. The first query fetches a single arbitrary row from the table; we gather all the column names from the result set and then filter those that contain the pattern/substring:

    from google.cloud import bigquery

    bigquery_client = bigquery.Client()

    query_str = "SELECT * FROM LIMIT 1 "
    query_job = bigquery_client.query(query_str)
    result = query_job.result()
    for row in result:
        columns = row.keys()

The keys() method returns the keys for using a row as a dict. Then we can filter on the columns variable.

What if we don’t want to use two queries? Maybe the Table API can help with this. Looking at the documentation, we can access the schema of a table using Table.schema, and the columns are just a list of SchemaField objects.

    from google.cloud import bigquery

    bigquery_client = bigquery.Client()

    table = bigquery_client.get_table('.

Then we can use the selected_columns to construct the query we want:

    columns_str = ",".join(selected_columns)
    # Be sure to consider what to do if the selected_columns is an empty list
    query_str = "SELECT FROM.
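As a minimal sketch of the client-side filtering and query-building steps, the helpers below pick the column names that match a pattern and assemble the SELECT statement, raising an error when nothing matches. The function names `select_columns` and `build_query`, the sample column names, and the table ID `my_project.my_dataset.my_table` are all hypothetical and only serve as an illustration; the list of names could come from `row.keys()` or from `[field.name for field in table.schema]`.

```python
import re


def select_columns(column_names, pattern):
    """Return the column names that match the regular expression `pattern`."""
    regex = re.compile(pattern)
    # re.search matches anywhere in the name, i.e. a "contains" check.
    return [name for name in column_names if regex.search(name)]


def build_query(table_id, selected_columns, limit=10):
    """Build a SELECT statement for the given columns.

    Raises ValueError when no column matched, because an empty
    column list would not produce valid SQL.
    """
    if not selected_columns:
        raise ValueError("no column matches the pattern")
    # Backticks guard against column names that clash with reserved words.
    columns_str = ", ".join(f"`{name}`" for name in selected_columns)
    return f"SELECT {columns_str} FROM `{table_id}` LIMIT {limit}"


# Hypothetical column names and table ID, for illustration only.
all_columns = ["id", "price_usd", "price_eur", "created_at"]
selected = select_columns(all_columns, "price")
query_str = build_query("my_project.my_dataset.my_table", selected)
print(query_str)
```

The resulting `query_str` can then be passed to `bigquery_client.query(...)` as in the other snippets. Because the column names come from the table itself rather than from the user, only the pattern is user input here, and it never ends up inside the SQL text.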