Contributor

@Adist319 Adist319 commented Jun 6, 2025

Summary

This PR addresses Issue #409 by ensuring that InitVar fields are not treated as instance attributes in dataclasses.

fixes #409

Changes

  • Introduced a new method is_init_var in both ClassField and Annotation to determine whether a field is an InitVar.
  • Updated get_instance_attribute to prevent access to InitVar fields as instance attributes.
  • Added a test case to verify that InitVar fields are correctly excluded from instance attribute access.
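
For context, here is a minimal Python sketch of the runtime behavior this change is meant to mirror. The `Scaled` class and its field names are made up for illustration and are not taken from the pyrefly test suite. An `InitVar` is accepted by `__init__` and forwarded to `__post_init__`, but it is never stored on the instance:

```python
from dataclasses import InitVar, dataclass, field


@dataclass
class Scaled:
    value: int = field(init=False)
    factor: InitVar[int]  # init-only: forwarded to __post_init__, never stored

    def __post_init__(self, factor: int) -> None:
        self.value = factor * 10


scaled = Scaled(5)
print(scaled.value)               # 50
print(hasattr(scaled, "factor"))  # False
```

A type checker that treats `factor` as an instance attribute would therefore accept `scaled.factor`, which fails at runtime.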

@facebook-github-bot
Contributor

@srchilukoori has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@srchilukoori

@Adist319 The internal tests failed because the conformance output is out of date. Could you run ./test.py and include the results in the PR, please?

@yangdanny97
Contributor

Hmm, based on the docs I think we shouldn't be counting it as a field at all. Blocking it only when it is accessed as an instance attribute can let it leak in other places, for example when it is accessed as a class variable:

from dataclasses import dataclass, field, InitVar
@dataclass
class InitVarTest:
    mode: InitVar[str] = "foo"
    
InitVarTest.mode

Would it be possible to not create a ClassField for it at all? Or to omit it from get_class_member except where we're generating the __init__/__post_init__ params? I'm not sure how easy or difficult that would be, so we might need to try it out and see.
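
As a rough illustration of why an InitVar is not a field at all, the following sketch checks what the `dataclasses` runtime reports. The `Config` class is hypothetical. It omits the default value on purpose, because a default would remain on the class as an ordinary class attribute at runtime:

```python
from dataclasses import InitVar, dataclass, fields


@dataclass
class Config:
    name: str
    mode: InitVar[str]  # no default, so no class attribute is created either

    def __post_init__(self, mode: str) -> None:
        self.name = f"{self.name}:{mode}"


field_names = [f.name for f in fields(Config)]
print(field_names)              # ['name']
print(hasattr(Config, "mode"))  # False
```

`fields()` skips the init-only pseudo-field, and class-level lookup of `mode` fails as well, so filtering only instance access would leave the class-level case uncovered.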

I would also expect this to generate a conformance test change - does test.py generate anything?

Contributor

@yangdanny97 yangdanny97 left a comment

Thanks! Back to you with some comments on the overall approach.

@Adist319
Contributor Author

Adist319 commented Jun 7, 2025

Thanks for the comments, @yangdanny97. I will look into this.

@Adist319
Contributor Author

Adist319 commented Jun 7, 2025


InitVar Filtering Regressions

I have refactored the attribute access methods (get_instance_attribute and get_class_attribute) to use a shared helper that filters out InitVar fields, and added test cases. The InitVar filtering itself works correctly, but it causes regressions in the conformance test output.

What's happening

The dataclass tests are failing with errors like "Expected 0 positional arguments, got 1 in function object.__init__". These errors appear across multiple dataclass files that don't use InitVar, and they did not occur before my change:

  • dataclasses_hash.py: 5 errors
  • dataclasses_kwonly.py: missing/unexpected argument errors
  • dataclasses_order.py: similar constructor issues

It looks like the type checker is falling back to object.__init__() instead of the proper dataclass-generated constructor.

The problem (from my understanding, correct me if I'm wrong)

get_instance_attribute() is used for both:

  1. Static analysis (dataclass constructor generation) where it needs InitVar fields
  2. Runtime attribute access where InitVar fields should be filtered out

When I filter InitVar fields, the dataclass generation can't access the field info it needs anymore.
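
The hypothesis above can be modeled with a toy sketch. Everything here (`FieldInfo`, `FIELDS`, `lookup`, `synthesize_init_params`) is a hypothetical stand-in rather than pyrefly code, and it only illustrates the suspected mechanism, not the confirmed root cause of the regressions:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class FieldInfo:
    name: str
    is_init_var: bool = False


# Hypothetical stand-in for the body of a dataclass with one InitVar.
FIELDS = {
    "value": FieldInfo("value"),
    "mode": FieldInfo("mode", is_init_var=True),
}


def lookup(name: str, filter_init_vars: bool) -> Optional[FieldInfo]:
    """Single lookup path shared by attribute access and __init__ synthesis."""
    info = FIELDS.get(name)
    if info is not None and filter_init_vars and info.is_init_var:
        return None
    return info


def synthesize_init_params(filter_init_vars: bool) -> List[str]:
    """Build constructor parameter names from whatever the lookup returns."""
    return [n for n in FIELDS if lookup(n, filter_init_vars) is not None]


print(synthesize_init_params(filter_init_vars=False))  # ['value', 'mode']
print(synthesize_init_params(filter_init_vars=True))   # ['value']
```

With one shared lookup, turning the filter on for attribute access also removes `mode` from the synthesized constructor.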

Question

What's the cleanest way to handle this? Should I create separate methods for these two use cases, or is there a better approach I'm missing?


@yangdanny97
Contributor

@Adist319

I see what you mean. Maybe blocking it from get_instance_attribute/get_class_attribute is enough.

We could add a special case DataclassMember::InitVar to the return type of

pub fn get_dataclass_member(&self, cls: &Class, name: &Name, kw_only: bool) -> DataclassMember {
and update the dataclass field synthesis in dataclass.rs to filter them out or use them where appropriate.

@Adist319
Contributor Author

Adist319 commented Jun 10, 2025

@yangdanny97 I think this could work, let me try it out. Thanks for the feedback.

@Adist319
Contributor Author

I've implemented your suggestions, @yangdanny97, but I keep hitting the same regressions, which makes me think there’s a deeper architectural issue I’m not seeing.

Core problem

Here's my current understanding, along with what else I tried:

It looks like the dataclass __init__ synthesis is triggering the same attribute resolution path (Expr::Attribute -> attr_infer_for_type) that runtime access uses.

That creates a Catch-22:

  • To fix runtime access, I need to filter out InitVars in attr_infer_for_type.
  • But doing that breaks the __init__ generator, since it can’t see the InitVars it needs to build the constructor. That brings back the same regressions from my earlier comment.

I tried restoring get_instance_attribute and moving the filter logic up in attr.rs and expr.rs, but the circular dependency remains.

What I’m trying to figure out

Is there a clean way to pass context down through the solver so the attribute resolver knows whether it's being called during __init__ generation vs. runtime?

It feels like I need a scoped flag or mode that tells Expr::Attribute not to filter InitVars in the constructor case, but I’m unsure how to pass that down without doing something hacky or breaking other logic.

Would appreciate any guidance on how this is supposed to be handled in the current architecture.

@yangdanny97
Contributor

yangdanny97 commented Jun 14, 2025

Hmm, that's unexpected. I thought it only used get_class_member under the hood. I'll pull your branch and poke around with it, and will get back to you on this early next week. Thanks for your patience and for looking into this!

@facebook-github-bot
Contributor

@yangdanny97 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@yangdanny97
Contributor

yangdanny97 commented Jun 16, 2025

Ok, I think I got something working with this. The tricky part was getting get_class_member to return InitVars for dataclasses, but not for anything else; I ended up refactoring the implementation into a helper function and making get_dataclass_member call that instead of calling get_class_member directly. The change has already been applied to the imported version of this PR, so if it looks good to you we can merge it. Thanks for working on this!

The diff is as follows:

diff --git a/pyrefly/conformance/third_party/conformance.exp b/pyrefly/conformance/third_party/conformance.exp
--- a/pyrefly/conformance/third_party/conformance.exp
+++ b/pyrefly/conformance/third_party/conformance.exp
@@ -3379,7 +3379,28 @@
       "stop_line": 50
     }
   ],
-  "dataclasses_postinit.py": [],
+  "dataclasses_postinit.py": [
+    {
+      "code": -2,
+      "column": 7,
+      "concise_description": "Object of class `DC1` has no attribute `x`",
+      "description": "Object of class `DC1` has no attribute `x`",
+      "line": 28,
+      "name": "missing-attribute",
+      "stop_column": 12,
+      "stop_line": 28
+    },
+    {
+      "code": -2,
+      "column": 7,
+      "concise_description": "Object of class `DC1` has no attribute `y`",
+      "description": "Object of class `DC1` has no attribute `y`",
+      "line": 29,
+      "name": "missing-attribute",
+      "stop_column": 12,
+      "stop_line": 29
+    }
+  ],
   "dataclasses_slots.py": [
     {
       "code": -2,
diff --git a/pyrefly/conformance/third_party/conformance.result b/pyrefly/conformance/third_party/conformance.result
--- a/pyrefly/conformance/third_party/conformance.result
+++ b/pyrefly/conformance/third_party/conformance.result
@@ -152,8 +152,6 @@
   "dataclasses_order.py": [],
   "dataclasses_postinit.py": [
     "Line 19: Expected 1 errors",
-    "Line 28: Expected 1 errors",
-    "Line 29: Expected 1 errors",
     "Line 36: Expected 1 errors"
   ],
   "dataclasses_slots.py": [
diff --git a/pyrefly/conformance/third_party/results.json b/pyrefly/conformance/third_party/results.json
--- a/pyrefly/conformance/third_party/results.json
+++ b/pyrefly/conformance/third_party/results.json
@@ -3,7 +3,7 @@
   "pass": 58,
   "fail": 78,
   "pass_rate": 0.43,
-  "differences": 384,
+  "differences": 382,
   "passing": [
     "aliases_explicit.py",
     "aliases_newtype.py",
@@ -81,7 +81,7 @@
     "dataclasses_descriptors.py": 1,
     "dataclasses_final.py": 3,
     "dataclasses_kwonly.py": 5,
-    "dataclasses_postinit.py": 4,
+    "dataclasses_postinit.py": 2,
     "dataclasses_slots.py": 5,
     "dataclasses_transform_class.py": 10,
     "dataclasses_transform_converter.py": 7,
diff --git a/pyrefly/pyrefly/lib/alt/class/class_field.rs b/pyrefly/pyrefly/lib/alt/class/class_field.rs
--- a/pyrefly/pyrefly/lib/alt/class/class_field.rs
+++ b/pyrefly/pyrefly/lib/alt/class/class_field.rs
@@ -344,6 +344,14 @@
         }
     }
 
+    pub fn is_init_var(&self) -> bool {
+        match &self.0 {
+            ClassFieldInner::Simple { annotation, .. } => {
+                annotation.as_ref().is_some_and(|ann| ann.is_init_var())
+            }
+        }
+    }
+
     pub fn is_final(&self) -> bool {
         match &self.0 {
             ClassFieldInner::Simple { annotation, ty, .. } => {
@@ -530,11 +538,12 @@
     }
 }
 
-#[expect(clippy::large_enum_variant)] // the vast majority of `DataclassMember`s are `Field`
 /// The result of processing a raw dataclass member (any annotated assignment in its body).
 pub enum DataclassMember {
     /// A dataclass field
     Field(ClassField, BoolKeywords),
+    /// A pseudo-field that only appears as a constructor argument
+    InitVar(ClassField),
     /// A pseudo-field annotated with KW_ONLY
     KwOnlyMarker,
     /// Anything else
@@ -834,7 +843,7 @@
             .ancestors(self.stdlib)
             .find_map(|parent| {
                 let parent_field =
-                    self.get_field_from_current_class_only(parent.class_object(), name)?;
+                    self.get_field_from_current_class_only(parent.class_object(), name, true)?;
                 found_field = true;
                 let ClassField(ClassFieldInner::Simple { annotation, .. }) = &*parent_field;
                 annotation.clone()
@@ -896,7 +905,7 @@
     pub fn get_dataclass_member(&self, cls: &Class, name: &Name, kw_only: bool) -> DataclassMember {
         // Even though we check that the class member exists before calling this function,
         // it can be None if the class has an invalid MRO.
-        let Some(member) = self.get_class_member(cls, name) else {
+        let Some(member) = self.get_class_member_impl(cls, name, true) else {
             return DataclassMember::NotAField;
         };
         let field = &*member.value;
@@ -912,6 +921,8 @@
                     .is_some_and(|annot| annot.has_qualifier(&Qualifier::ClassVar)))
         {
             DataclassMember::NotAField // Class variables are not dataclass fields
+        } else if field.is_init_var() {
+            DataclassMember::InitVar(field.clone())
         } else {
             DataclassMember::Field(field.clone(), field.dataclass_flags_of(kw_only))
         }
@@ -927,7 +938,7 @@
         errors: &ErrorCollector,
     ) -> Type {
         if let Some(method_field) =
-            self.get_non_synthesized_field_from_current_class_only(class, method_name)
+            self.get_non_synthesized_field_from_current_class_only(class, method_name, false)
         {
             match &method_field.raw_type() {
                 Type::Forall(box Forall { tparams, .. }) => {
@@ -1221,10 +1232,15 @@
         &self,
         cls: &Class,
         name: &Name,
+        include_initvar: bool,
     ) -> Option<Arc<ClassField>> {
         if cls.contains(name) {
             let field = self.get_from_class(cls, &KeyClassField(cls.index(), name.clone()));
-            Some(field)
+            if !include_initvar && field.is_init_var() {
+                None
+            } else {
+                Some(field)
+            }
         } else {
             None
         }
@@ -1235,8 +1251,11 @@
         &self,
         cls: &Class,
         name: &Name,
+        include_initvar: bool,
     ) -> Option<Arc<ClassField>> {
-        if let Some(field) = self.get_non_synthesized_field_from_current_class_only(cls, name) {
+        if let Some(field) =
+            self.get_non_synthesized_field_from_current_class_only(cls, name, include_initvar)
+        {
             Some(field)
         } else {
             let synthesized_fields =
@@ -1246,12 +1265,13 @@
         }
     }
 
-    pub(in crate::alt::class) fn get_class_member(
+    fn get_class_member_impl(
         &self,
         cls: &Class,
         name: &Name,
+        include_initvar: bool,
     ) -> Option<WithDefiningClass<Arc<ClassField>>> {
-        if let Some(field) = self.get_field_from_current_class_only(cls, name) {
+        if let Some(field) = self.get_field_from_current_class_only(cls, name, include_initvar) {
             Some(WithDefiningClass {
                 value: field,
                 defining_class: cls.dupe(),
@@ -1260,15 +1280,27 @@
             self.get_metadata_for_class(cls)
                 .ancestors(self.stdlib)
                 .find_map(|ancestor| {
-                    self.get_field_from_current_class_only(ancestor.class_object(), name)
-                        .map(|field| WithDefiningClass {
-                            value: Arc::new(field.instantiate_for(&Instance::of_class(ancestor))),
-                            defining_class: ancestor.class_object().dupe(),
-                        })
+                    self.get_field_from_current_class_only(
+                        ancestor.class_object(),
+                        name,
+                        include_initvar,
+                    )
+                    .map(|field| WithDefiningClass {
+                        value: Arc::new(field.instantiate_for(&Instance::of_class(ancestor))),
+                        defining_class: ancestor.class_object().dupe(),
+                    })
                 })
         }
     }
 
+    pub(in crate::alt::class) fn get_class_member(
+        &self,
+        cls: &Class,
+        name: &Name,
+    ) -> Option<WithDefiningClass<Arc<ClassField>>> {
+        self.get_class_member_impl(cls, name, false)
+    }
+
     pub fn get_instance_attribute(&self, cls: &ClassType, name: &Name) -> Option<Attribute> {
         self.get_class_member(cls.class_object(), name)
             .map(|member| {
@@ -1325,7 +1357,7 @@
             .skip_while(|ancestor| *ancestor != start_lookup_cls);
         for ancestor in ancestors {
             if let Some(found) = self
-                .get_field_from_current_class_only(ancestor.class_object(), name)
+                .get_field_from_current_class_only(ancestor.class_object(), name, false)
                 .map(|field| WithDefiningClass {
                     value: Arc::new(field.instantiate_for(&Instance::of_class(ancestor))),
                     defining_class: ancestor.class_object().dupe(),
diff --git a/pyrefly/pyrefly/lib/alt/class/dataclass.rs b/pyrefly/pyrefly/lib/alt/class/dataclass.rs
--- a/pyrefly/pyrefly/lib/alt/class/dataclass.rs
+++ b/pyrefly/pyrefly/lib/alt/class/dataclass.rs
@@ -108,6 +108,7 @@
         &self,
         cls: &Class,
         fields: &SmallSet<Name>,
+        include_initvar: bool,
     ) -> Vec<(Name, ClassField, BoolKeywords)> {
         let mut kw_only = false;
         fields
@@ -119,6 +120,13 @@
                 }
                 DataclassMember::NotAField => None,
                 DataclassMember::Field(field, keywords) => Some((name.clone(), field, keywords)),
+                DataclassMember::InitVar(field) => {
+                    if include_initvar {
+                        Some((name.clone(), field, BoolKeywords::new()))
+                    } else {
+                        None
+                    }
+                }
             })
             .collect()
     }
@@ -131,7 +139,7 @@
         kw_only: bool,
     ) -> ClassSynthesizedField {
         let mut params = vec![self.class_self_param(cls, false)];
-        for (name, field, field_flags) in self.iter_fields(cls, fields) {
+        for (name, field, field_flags) in self.iter_fields(cls, fields, true) {
             if field_flags.is_set(&DataclassKeywords::INIT) {
                 params.push(field.as_param(
                     &name,
@@ -161,7 +169,7 @@
         let ts = if kw_only {
             Vec::new()
         } else {
-            let filtered_fields = self.iter_fields(cls, fields);
+            let filtered_fields = self.iter_fields(cls, fields, false);
             filtered_fields
                 .iter()
                 .filter_map(|(name, _, field_flags)| {
diff --git a/pyrefly/pyrefly/lib/alt/class/enums.rs b/pyrefly/pyrefly/lib/alt/class/enums.rs
--- a/pyrefly/pyrefly/lib/alt/class/enums.rs
+++ b/pyrefly/pyrefly/lib/alt/class/enums.rs
@@ -19,7 +19,7 @@
 
 impl<'a, Ans: LookupAnswer> AnswersSolver<'a, Ans> {
     pub fn get_enum_member(&self, cls: &Class, name: &Name) -> Option<Lit> {
-        self.get_field_from_current_class_only(cls, name)
+        self.get_field_from_current_class_only(cls, name, false)
             .and_then(|field| Arc::unwrap_or_clone(field).as_enum_member(cls))
     }
 
diff --git a/pyrefly/pyrefly/lib/test/dataclasses.rs b/pyrefly/pyrefly/lib/test/dataclasses.rs
--- a/pyrefly/pyrefly/lib/test/dataclasses.rs
+++ b/pyrefly/pyrefly/lib/test/dataclasses.rs
@@ -657,3 +657,26 @@
     y: str
     "#,
 );
+
+testcase!(
+    test_initvar_not_stored_as_attributes,
+    r#"
+from dataclasses import dataclass, field, InitVar
+@dataclass
+class InitVarTest:
+    value: int = field(init=False)
+    mode: InitVar[str]
+    count: InitVar[int]
+    def __post_init__(self, mode: str, count: int):
+        if mode == "number":
+            self.value = count * 10
+        else:
+            self.value = 0
+instance = InitVarTest("number", 5)
+# InitVar fields should not be accessible as instance attributes
+instance.mode  # E: Object of class `InitVarTest` has no attribute `mode`
+instance.count  # E: Object of class `InitVarTest` has no attribute `count`
+# Regular fields should be accessible
+instance.value  # OK
+    "#,
+);
diff --git a/pyrefly/pyrefly/lib/types/annotation.rs b/pyrefly/pyrefly/lib/types/annotation.rs
--- a/pyrefly/pyrefly/lib/types/annotation.rs
+++ b/pyrefly/pyrefly/lib/types/annotation.rs
@@ -62,6 +62,10 @@
         self.has_qualifier(&Qualifier::Final)
     }
 
+    pub fn is_init_var(&self) -> bool {
+        self.has_qualifier(&Qualifier::InitVar)
+    }
+
     pub fn has_qualifier(&self, qualifier: &Qualifier) -> bool {
         self.qualifiers.iter().any(|q| q == qualifier)
     }
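
The core pattern in the diff above is a private lookup that takes an `include_initvar` flag, wrapped by a public lookup that always passes `false`. It can be sketched in Python as follows. This is a simplified model with a made-up class table (`CLASSES`), not a translation of the Rust code:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ClassField:
    name: str
    is_init_var: bool = False


# Hypothetical class table: class name -> (own fields, base class name or None).
CLASSES = {
    "Base": ({"mode": ClassField("mode", is_init_var=True)}, None),
    "Child": ({"value": ClassField("value")}, "Base"),
}


def _get_class_member_impl(
    cls: str, name: str, include_initvar: bool
) -> Optional[ClassField]:
    """Walk the class and its ancestors, optionally hiding InitVar pseudo-fields."""
    current: Optional[str] = cls
    while current is not None:
        own_fields, parent = CLASSES[current]
        found = own_fields.get(name)
        if found is not None and (include_initvar or not found.is_init_var):
            return found
        current = parent
    return None


def get_class_member(cls: str, name: str) -> Optional[ClassField]:
    """General attribute lookup: InitVars are invisible."""
    return _get_class_member_impl(cls, name, include_initvar=False)


def get_dataclass_member(cls: str, name: str) -> Optional[ClassField]:
    """Dataclass synthesis: InitVars are needed to build __init__ parameters."""
    return _get_class_member_impl(cls, name, include_initvar=True)


print(get_class_member("Child", "mode"))                  # None
print(get_dataclass_member("Child", "mode").is_init_var)  # True
```

Passing the flag explicitly keeps ordinary callers on the filtered path by default, and only the dataclass synthesis code opts in to seeing InitVars.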

Contributor

@rchen152 rchen152 left a comment

Review automatically exported from Phabricator review in Meta.

@facebook-github-bot
Contributor

@yangdanny97 merged this pull request in f32c44e.

Development

Successfully merging this pull request may close these issues.

[feature] InitVars should not count as dataclass fields

5 participants