Steps to reproduce the behavior
Case 1: CTAS with explicit CAST(... AS VARCHAR(n))
Create a table with CTAS:
CREATE TABLE sandbox.test_varchar_type AS
SELECT CAST('abc' AS VARCHAR(64)) AS c;
Check the resulting column type:
SELECT *
FROM INFORMATION_SCHEMA.COLUMNS
WHERE table_name = 'test_varchar_type'
AND table_schema = 'sandbox';
Case 2: CTAS from a source table that already has VARCHAR(n)
Create a source table:
CREATE TABLE sandbox.src_varchar (
c VARCHAR(64)
);
Create another table from it via CTAS:
CREATE TABLE sandbox.test_varchar_from_table AS
SELECT c
FROM sandbox.src_varchar;
Check the resulting column type:
SELECT *
FROM INFORMATION_SCHEMA.COLUMNS
WHERE table_name = 'test_varchar_from_table'
AND table_schema = 'sandbox';
Additional check
The following parameters do not change the behavior in any combination:
enable_reduce_cast_varchar_expr_sync_type
enable_reduce_cast_varchar_length_inheritance
Expected behavior
CTAS should preserve the declared VARCHAR(n) length from the output schema of the SELECT.
Expected examples:
- CAST('abc' AS VARCHAR(64)) should create a column with type VARCHAR(64)
- selecting a column defined as VARCHAR(64) from an existing table should create a column with type VARCHAR(64)
In general, if the projected column type is VARCHAR(n), then the created table column should also be VARCHAR(n).
Real behavior
In both scenarios, CTAS creates the resulting column as:
Instead of preserving the original varchar length, StarRocks widens the type to the maximum varchar length.
Example values from INFORMATION_SCHEMA.COLUMNS:
- DATA_TYPE = varchar
- CHARACTER_MAXIMUM_LENGTH = 1048576
- COLUMN_TYPE = varchar(1048576)
This is a major issue because it effectively blocks normal usage of dbt with StarRocks in common workflows.
dbt relies heavily on CTAS to create temporary and intermediate tables. If every varchar column is expanded to VARCHAR(1048576), then generated schemas no longer match intended model definitions, making dbt-based development and production usage problematic.
StarRocks version
Reproduced on:
3.5.17
3.5.15
Steps to reproduce the behavior
Case 1: CTAS with explicit
CAST(... AS VARCHAR(n))Create a table with CTAS:
Check the resulting column type:
Case 2: CTAS from a source table that already has VARCHAR(n)
Create a source table:
Create another table from it via CTAS:
Check the resulting column type:
Additional check
The following parameters do not change the behavior in any combination:
Expected behavior
CTAS should preserve the declared VARCHAR(n) length from the output schema of the SELECT.
Expected examples:
In general, if the projected column type is VARCHAR(n), then the created table column should also be VARCHAR(n).
Real behavior
In both scenarios, CTAS creates the resulting column as:
Instead of preserving the original varchar length, StarRocks widens the type to the maximum varchar length.
Example values from INFORMATION_SCHEMA.COLUMNS:
This is a major issue because it effectively blocks normal usage of dbt with StarRocks in common workflows.
dbt relies heavily on CTAS to create temporary and intermediate tables. If every varchar column is expanded to VARCHAR(1048576), then generated schemas no longer match intended model definitions, making dbt-based development and production usage problematic.
StarRocks version
Reproduced on:
3.5.17
3.5.15