FEAT: Adding setinputsizes #192

jahnvi480 · 2025-08-21T15:37:00Z

Work Item / Issue Reference

AB#32890

Summary

This pull request adds support for the setinputsizes method to the mssql_python DB-API cursor, allowing users to explicitly specify SQL parameter types and sizes for queries. This enhancement improves parameter binding control, especially for batch operations and cases where automatic type inference may be insufficient. The changes also ensure that input size specifications are reset after each execution, and comprehensive tests are included to verify the new functionality and its integration with both execute and executemany.

New feature: Explicit parameter typing with setinputsizes

Added a setinputsizes method to the cursor class, enabling users to declare SQL types, sizes, and decimal digits for query parameters. This method stores the input sizes and provides detailed documentation and usage examples. (mssql_python/cursor.py)
Implemented logic in parameter binding to use explicitly set input sizes when available, falling back to automatic type inference otherwise. This applies to both single and batch executions. (mssql_python/cursor.py)

Robustness and reset behavior

Ensured that input size specifications are automatically reset after each call to execute or executemany, preventing unintended reuse across statements. (mssql_python/cursor.py)

Testing and validation

Added comprehensive tests to cover basic usage, batch inserts with floats, reset behavior, and explicit override of type inference using setinputsizes. These tests verify correct parameter binding, data insertion, and reset semantics. (tests/test_004_cursor.py)

Internal improvements

Introduced helper methods for mapping SQL types to C types and for resetting input sizes, improving code clarity and maintainability. (mssql_python/cursor.py)

These changes provide more reliable and predictable parameter binding for users, especially in complex or high-performance scenarios.
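The store, fall back, and reset behavior described above can be sketched with a stand-in class. This is only an illustration under stated assumptions: `SketchCursor` and its `bound` list are hypothetical and do not come from mssql_python; only the method name `setinputsizes`, the `(sql_type, size, decimal_digits)` tuple shape, and the reset-after-execute rule are taken from the description.

```python
from typing import Any, List, Optional, Sequence, Tuple

# ODBC SQL type code for SQL_WVARCHAR (value from the ODBC headers).
SQL_WVARCHAR = -9

InputSize = Tuple[int, int, int]  # (sql_type, column_size, decimal_digits)


class SketchCursor:
    """Hypothetical stand-in; not the real mssql_python cursor."""

    def __init__(self) -> None:
        self._inputsizes: Optional[List[InputSize]] = None
        # Records, per parameter, whether typing was explicit or inferred.
        self.bound: List[Tuple[Any, str]] = []

    def setinputsizes(self, sizes: Sequence[InputSize]) -> None:
        self._inputsizes = list(sizes)

    def execute(self, sql: str, params: Sequence[Any] = ()) -> None:
        try:
            self.bound = []
            for i, value in enumerate(params):
                if self._inputsizes and i < len(self._inputsizes):
                    self.bound.append((value, "explicit"))
                else:
                    # No explicit entry for this position: infer the type.
                    self.bound.append((value, "inferred"))
        finally:
            # Sizes apply to one statement only, even if binding fails.
            self._inputsizes = None


cur = SketchCursor()
cur.setinputsizes([(SQL_WVARCHAR, 100, 0)])
cur.execute("INSERT INTO t VALUES (?, ?)", ["abc", 1])
first_call = list(cur.bound)
cur.execute("INSERT INTO t VALUES (?)", ["abc"])
second_call = list(cur.bound)
```

In this sketch the second call no longer sees the sizes set before the first one, which is the reuse the reset is meant to prevent.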

sumitmsft · 2025-08-25T05:42:05Z

mssql_python/cursor.py

@@ -463,6 +464,71 @@ def _check_closed(self):
        if self.closed:
            raise Exception("Operation cannot be performed: the cursor is closed.")

+    def setinputsizes(self, sizes):


Please consider adding type annotations to new methods such as setinputsizes, _reset_inputsizes, and _get_c_type_for_sql_type. Type annotations improve code clarity, enable better static analysis, and keep the codebase maintainable as it grows.
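A rough idea of what the annotated signatures might look like is shown here. The method names come from the PR; the parameter shapes, the return types, and the contents of the mapping are assumptions made for this sketch, and `AnnotatedCursorSketch` is a hypothetical class, not the real cursor.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# ODBC codes (standard header values).
SQL_INTEGER = 4
SQL_WVARCHAR = -9
SQL_C_LONG = 4
SQL_C_WCHAR = -8
SQL_C_DEFAULT = 99

InputSize = Tuple[int, int, int]  # (sql_type, column_size, decimal_digits)


class AnnotatedCursorSketch:
    """Hypothetical class showing annotations only."""

    def __init__(self) -> None:
        self._inputsizes: Optional[List[InputSize]] = None

    def setinputsizes(self, sizes: Sequence[InputSize]) -> None:
        self._inputsizes = list(sizes)

    def _reset_inputsizes(self) -> None:
        self._inputsizes = None

    def _get_c_type_for_sql_type(self, sql_type: int) -> int:
        # Assumed, partial mapping; the real table is not shown in the diff.
        mapping: Dict[int, int] = {
            SQL_INTEGER: SQL_C_LONG,
            SQL_WVARCHAR: SQL_C_WCHAR,
        }
        return mapping.get(sql_type, SQL_C_DEFAULT)
```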

sumitmsft · 2025-08-25T05:48:24Z

mssql_python/cursor.py

-        )
+
+        # Check if we have explicit type information from setinputsizes
+        if hasattr(self, '_inputsizes') and self._inputsizes and i < len(self._inputsizes):


Several places check for the attribute with hasattr(self, '_inputsizes').
Since _inputsizes is always initialized in the class's __init__ method, the attribute exists on every instance and does not need an existence check on each use.
These hasattr checks are therefore redundant.
Removing them will make the code cleaner, easier to read, and easier to maintain.
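A minimal sketch of the simplified check, assuming `_inputsizes` starts as `None` in `__init__`. The class and the helper name `_explicit_size_for` are hypothetical and only illustrate the condition.

```python
from typing import List, Optional, Tuple

InputSize = Tuple[int, int, int]  # (sql_type, column_size, decimal_digits)


class CursorSketch:
    """Hypothetical class; only the attribute name comes from the PR."""

    def __init__(self) -> None:
        # Always initialized, so later code can rely on its presence.
        self._inputsizes: Optional[List[InputSize]] = None

    def _explicit_size_for(self, i: int) -> Optional[InputSize]:
        # No hasattr() guard: None and [] both mean "nothing set".
        if self._inputsizes and i < len(self._inputsizes):
            return self._inputsizes[i]
        return None
```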

sumitmsft · 2025-08-25T05:59:06Z

mssql_python/cursor.py

-        sql_type, c_type, column_size, decimal_digits = self._map_sql_type(
-            parameter, parameters_list, i
-        )
+


Please double-check that all parameterized queries remain fully protected against SQL injection—even when input sizes or types are set by users via setinputsizes. It's important to ensure that user-supplied values for input sizes/types cannot be used to inject malicious SQL or bypass query parameterization. If possible, add validation or sanitization where needed, and consider adding a test case for this scenario.
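One possible form of the validation, as a sketch only: `validate_inputsizes` and `ALLOWED_SQL_TYPES` are hypothetical names, and the allow-list is an assumption rather than the set the driver actually supports. The idea is that entries are accepted only when they are plain integer triples with a known type code, so a string can never reach the binding layer through this path.

```python
from typing import Any, List, Sequence, Tuple

# ODBC SQL type codes (standard header values). Which ones the driver
# accepts is an assumption made for this sketch.
ALLOWED_SQL_TYPES = frozenset({
    4,    # SQL_INTEGER
    3,    # SQL_DECIMAL
    8,    # SQL_DOUBLE
    12,   # SQL_VARCHAR
    -9,   # SQL_WVARCHAR
    -2,   # SQL_BINARY
    91,   # SQL_TYPE_DATE
    92,   # SQL_TYPE_TIME
})


def validate_inputsizes(sizes: Sequence[Any]) -> List[Tuple[int, int, int]]:
    """Accept only (sql_type, column_size, decimal_digits) integer triples."""
    validated: List[Tuple[int, int, int]] = []
    for pos, entry in enumerate(sizes):
        if not isinstance(entry, tuple) or len(entry) != 3:
            raise TypeError(f"inputsizes[{pos}] must be a 3-tuple, got {entry!r}")
        for field in entry:
            # bool is a subclass of int, so reject it explicitly.
            if isinstance(field, bool) or not isinstance(field, int):
                raise TypeError(f"inputsizes[{pos}] has non-integer field {field!r}")
        sql_type, column_size, decimal_digits = entry
        if sql_type not in ALLOWED_SQL_TYPES:
            raise ValueError(f"inputsizes[{pos}] has unknown SQL type {sql_type}")
        if column_size < 0 or decimal_digits < 0:
            raise ValueError(f"inputsizes[{pos}] has a negative size or scale")
        validated.append((sql_type, column_size, decimal_digits))
    return validated
```

A test for the scenario could pass a string such as `"1; DROP TABLE t"` in place of a type code and assert that a `TypeError` is raised before anything is sent to the server.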

bewithgaurav · 2025-08-25T05:59:54Z

tests/test_004_cursor.py

+
+    # Set input sizes for parameters
+    cursor.setinputsizes([
+        (ConstantsDDBC.SQL_WVARCHAR.value, 100, 0),


Let's refine the usage of constants a bit more; we should probably export them at the module level.
Example usage from pyodbc:

crsr.setinputsizes([(pyodbc.SQL_WVARCHAR, 50, 0), (pyodbc.SQL_DECIMAL, 18, 4)])

We could go for something like mssql_python.SQL_WVARCHAR.
This can be a separate task, since the usage is end-user facing.
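A small sketch of the module-level export idea. The `ConstantsDDBC` enum here is a stand-in with three members, written only for illustration; the real enum lives in mssql_python and its full contents are not shown in this thread. The numeric values are the standard ODBC codes.

```python
from enum import Enum


class ConstantsDDBC(Enum):
    """Stand-in for the package's enum; only a few members are listed."""
    SQL_WVARCHAR = -9
    SQL_DECIMAL = 3
    SQL_INTEGER = 4


# Plain integer aliases, as they might appear in the package's __init__.py.
SQL_WVARCHAR = ConstantsDDBC.SQL_WVARCHAR.value
SQL_DECIMAL = ConstantsDDBC.SQL_DECIMAL.value
SQL_INTEGER = ConstantsDDBC.SQL_INTEGER.value

# Call sites then read like the pyodbc example, without the .value suffix.
sizes = [(SQL_WVARCHAR, 50, 0), (SQL_DECIMAL, 18, 4)]
```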

sumitmsft · 2025-08-25T06:31:20Z

mssql_python/cursor.py

            except Exception as e:
                log('warning', f"Failed to set query timeout: {e}")

        param_info = ddbc_bindings.ParamInfo
        param_count = len(seq_of_parameters[0])
        parameters_type = []
+
+        # Make a copy of the parameters for potential transformation


Consider raising a warning or an error if the number of input sizes set via setinputsizes does not match the number of parameters provided to executemany. This will help catch user mistakes early and prevent subtle bugs caused by mismatched parameter and input size definitions.
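A sketch of such a check, assuming it would run in executemany once `param_count` is known. The helper name `check_inputsizes_count` and the `strict` flag are hypothetical, and `ValueError` is used only as a placeholder for whatever exception class the package prefers.

```python
import warnings
from typing import Optional, Sequence, Tuple

InputSize = Tuple[int, int, int]  # (sql_type, column_size, decimal_digits)


def check_inputsizes_count(
    inputsizes: Optional[Sequence[InputSize]],
    param_count: int,
    strict: bool = False,
) -> None:
    """Warn (or raise when strict) if the size list and parameters differ in length."""
    if not inputsizes:
        return  # Nothing was set; type inference handles every parameter.
    if len(inputsizes) == param_count:
        return
    message = (
        f"setinputsizes received {len(inputsizes)} entries, "
        f"but the statement has {param_count} parameters"
    )
    if strict:
        raise ValueError(message)
    warnings.warn(message, stacklevel=2)
```

Whether a mismatch should be a warning or a hard error is a design choice: a warning keeps the existing fallback to inference for the extra parameters, while an error surfaces the mistake immediately.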

sumitmsft · 2025-08-25T06:44:30Z

mssql_python/cursor.py

            except Exception as e:
                log('warning', f"Failed to set query timeout: {e}")

        param_info = ddbc_bindings.ParamInfo
        param_count = len(seq_of_parameters[0])
        parameters_type = []
+
+        # Make a copy of the parameters for potential transformation
+        processed_parameters = [list(params) for params in seq_of_parameters]


The code is using [list(params) for params in seq_of_parameters] to create a new list of lists from seq_of_parameters (which is probably a list or sequence of parameters for batch inserts).

When you do this for a very large number of rows (for example, thousands or millions), it creates a copy of every row in memory. This can use a lot of memory and might slow things down or even cause crashes if there isn’t enough memory.

If possible, don’t create a big copy of all the data at once.
Instead, you could use a generator expression (which makes one item at a time, only when needed) or change the items in place (if it’s safe to do so).

Current Implementation:

processed_parameters = [list(params) for params in seq_of_parameters]

This creates a new list in memory that contains a copy of every params as a list.
If seq_of_parameters has 1,000,000 items, Python immediately builds a list with 1,000,000 copies in memory.
This can use a lot of memory at once.

Generator Expression:

processed_parameters = (list(params) for params in seq_of_parameters)

This creates a generator—not a list. It doesn’t copy anything right away.
Each list(params) is created only when you need it (for example, when you loop over processed_parameters).
Much less memory is used because only one item is in memory at a time.

List comprehension is eager: makes everything up front, uses more memory.
Generator expression is lazy: makes each result only when needed, uses less memory.
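The difference can be seen in a small standalone sketch. The function name `iter_row_copies` is hypothetical; it does the same thing as the generator expression above, written as a generator function so it can carry a docstring.

```python
from typing import Any, Iterable, Iterator, List, Sequence


def iter_row_copies(seq_of_parameters: Iterable[Sequence[Any]]) -> Iterator[List[Any]]:
    """Yield a mutable copy of each row, one at a time."""
    for params in seq_of_parameters:
        yield list(params)


rows = [(1, "a"), (2, "b"), (3, "c")]

lazy = iter_row_copies(rows)
first = next(lazy)          # Only now is the first row copied.
first[1] = "A"              # Changing the copy leaves the caller's data alone.
remaining = list(lazy)      # Consumes the rest of the generator.
exhausted = list(lazy)      # A generator can only be walked once.
```

One caveat with this approach: a generator cannot be indexed, measured with `len()`, or iterated a second time. If the binding layer needs the whole batch as a list in one call (the excerpt does not show this), the lazy form would have to be materialized anyway, or the rows processed in chunks.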

sumitmsft · 2025-08-25T06:56:59Z

tests/test_004_cursor.py

@@ -1556,6 +1556,189 @@ def test_decimal_separator_calculations(cursor, db_connection):
        cursor.execute("DROP TABLE IF EXISTS #pytest_decimal_calc_test")
        db_connection.commit()

+def test_cursor_setinputsizes_basic(db_connection):


Add tests for cases where the number of input sizes does not match the number of parameters.

Add tests with None/NULL values to verify robust handling.

Add tests for all supported SQL types, including edge types (DATE, TIME, BINARY).

sumitmsft

Left a few comments...

FEAT: Adding setinputsizes

e4f56c7

github-actions bot added the pr-size: medium Moderate update size label Aug 21, 2025

sumitmsft reviewed Aug 25, 2025

bewithgaurav requested changes Aug 25, 2025

sumitmsft reviewed Aug 25, 2025

sumitmsft requested changes Aug 25, 2025
