-
Notifications
You must be signed in to change notification settings - Fork 18k
proposal: reflect: add DeepCopy #51520
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Have you seen previous proposals like #18371? |
Unlike #18371, this proposal suggests deep copy as a package API (as a complement of |
The reflect API was already mentioned there, but in any case, it was rejected due to the functionality being proposed and not what package API it's under. |
The previous decision was made based on the interpretation of it should not be built-in. The discussion of as a reflect API seems not to converge to a consensus yet (as how I understand the conversation). Sure, the deep copy behavior can be implemented as an external package as there are many already. But as an API in reflect package, it can also receive runtime performance improvements (e.g. direct memory copy). Moreover, it is also a standard implementation to avoid potentially incorrect implementation. |
How should |
I can imagine two options for as the behavior (if I understood the question correctly):
package main
import (
"unsafe"
"reflect"
)
type A struct{ b *B }
type B struct{ a *A }
func main() {
x := &A{}
y := &B{}
x.b = y
y.a = x
println(memAddr(x), memAddr(y), memAddr(x.b), memAddr(x.b.a), memAddr(x.b.a.b)) // 1 2 2 1 2
var z *A
reflect.DeepCopy(z, x)
println(memAddr(z), memAddr(z.b), memAddr(z.b.a), memAddr(z.b.a.b)) // 3 4 3 4
}
func memAddr[T any](x *T) uintptr { return uintptr(unsafe.Pointer(x)) } I would more prefer the second case, hence propose the second option. |
I feel like there is little benefit to the output value being an argument to the function. Any, for example, fields in a struct that are interfaces, pointers, maps, etc. are going to require the function to instantiate new values from scratch anyways if it's really going to create a full copy, so how about just
|
How does it handle overlapping cyclic structures? For example: type S struct {
Next *S
}
func main() {
one := &S{}
two := &S{Next: &S{Next: &S{Next: one}}}
one.Next = two
reflect.DeepCopy(one, two)
} Obviously this would be a really weird case, but something similar could happen accidentally. If the address stored in |
I think this is one of the possible optimal versions of deep copy. The pros are as you explained and don't require further document about the status of destination. The cons might be that it always have to allocate on heap. The proposed versions have pros such as the destination might be stack allocated, hence GC friendly. The potential con could be addressed by documenting more:
Either way is acceptable, and have their unique advantages. Depends on the final decision. |
This is an interesting case and I would more prefer to panic as a misuse and not handling it. It is solved by adding this constraint as my last comment elaborated:
|
One thing that is problematic is that currently the reflect package disallows setting private (unexported) struct fields. But your proposed version of Also, there are several interesting cases beyond just To me it seems like the de facto standard |
I would argue that deep copy is not about violating setting private struct fields because one cannot use it to set a particular private field. Instead, the API duplicates one to the other completely.
These are valid cases, but also make the argument of "why a standard implementation is necessary" stronger. To my senses, I think:
I am not sure which
defining a common interface (as a Let's still assume a type set |
If the original mutex was locked, what is the state of the new mutex? If the struct being copied contained a *os.File, what is the state of the cloned *os.File? Does it point a new a file descriptor, or the original file descriptor? What happens when code was using the mutex to serialise access to that file descriptor? |
Either option is acceptable. Setting the new mutex to an unlocked state could be considered as the intuition of the newly created object is not locked; retaining the lock state fits more to have the exact memory representation. The choice between the two is a matter of decision.
This is a similar case. Depending on the choice, we could either retain the state of Briefly summarize these exceptional cases, it all relates to the problem: what should we do when handling a stateful object, such as channels, Subjectively, I would argue One may also argue that changing the state of a copied object, such as mutex, violates the duplication semantics of "copy", but we would also need to be aware that a deep copied value does not necessarily |
What I mean is that if this were to be implemented, the reflect package would have to do a lot of special casing. It cannot be implemented in terms of the existing operations. Wit regards to
The name of the method is ultimately immaterial. The important part is that it be a method, because the behavior of that method is now part of the type's contract. In particular, the returned object is actually valid/usable, whatever that means for the type in question. |
How will this Deep Copy feature of the reflect library of Golang be useful for making runtime dynamic conversion of any structured data-field into (say csv file) for audit - level entrypoint functionality. |
I still fail to see why it have to do "a lot" special casing, and how it "cannot" be implemented. I think we might be better ask: When will a developer consider to use an API for deep copy? If they have a struct with tons of net.Conn, mutexes, and os.File, how likely such a value needs a deep copy? What are the primary use cases of this API? How much they can benifit from DeepCopy? #51520 (comment) is a reasonable usecase.
Sorry, I don't think this argument is strong enough, as it was already argued that a Cloneable interface is not easily possible and already incompatible with existing usage. See #51520 (comment)
I think this is the suggested behavior based on the discussion until now (copied from the tip of the thread):
|
@changkun First, I never said it cannot be implemented. I said it cannot be implemented in terms of the existing operations, because the reflect package does not allow setting private fields. So it would have to make some internal exception when it is doing so as part of your proposed method implementation. This also means that the reflect package is the only place that such a method could be implemented.
To be honest, I don't understand what the use case from @EbenezerJesuraj is. I don't see any relationship between
It is merely a statement of fact. Either the reflect package applies a one-size-fits-all solution (meaning it always copies struct fields blindly, regardless of whether it's Frankly, I challenge the two claims you made in the original post.
Better runtime performance compared to what? If you mean compared to writing the same code outside the reflect package, then maybe. But compared to a |
I am not sure what your codebase looks like, but the wide Go user code bases certainly have the need, with evidence in https://pkg.go.dev/search?q=deep+copy&m= . Customization seems to fall out of the discussion and not the purpose of this proposal, it might be worth having another proposal.
Note that the sentence was not a claim but only communicating possibilities. This is based on the fact that the reflect has closer access to runtime itself, such as direct memory copy, collaboration with GC, etc. Discussing the actual performance in the implementation and benchmarking seems too early as the proposal is not yet accepted. |
EJ Reply: Package reflect implements run-time reflection, allowing a program to manipulate objects with arbitrary types. The typical use is to take a value with static type interface{} and extract its dynamic type information by calling TypeOf, which returns a Type. CSV's was an example scenario implementing the Deep Copy feature fo the reflect library. (Say I receive a set of string related data in a structure format, reflect libraries DeepCopy feature should be able to understnad the nature of the data-type(s) if there the structure not only holds strings but a combination of a lot of primitive data-types and our need is to store and view it in an excel sheet format (csv or tsv and so on). Consider the above scenario which is required/performed on a Golang based scaffold using an hexagonal architectural pattern) will the Deep Copy feature of reflect library naturally support this use-cae. |
Auxillary Info: We can also say that the program knows the meta-data of the different types of data present in the struct without even reading the actual structure/db |
Auxillary Info 2: csv was just a particular use-case scenario. The Storage can be of any means known by the industry. Shallow Copy will only make a pointer point to the same location which considering an endpoint-level changes happening in a business-application is highly infeasible(a project which i am working on currently). So will DeepCopy.reflect support such use-cases. |
Speaking of use cases, I can provide one example from my practice. In graphics applications, geometric data are represented by mesh consisting of vertices, edges, faces, etc. Vertices have different attributes, edges can be directional, faces can be polygon-based. Clone/Copy/Duplicate such an object is not easy and requires many developers' effort. But with DeepCopy, one can easily create a copy of a given mesh. This is very helpful for geometry processing cases where caching cascaded meshes are convenient for different kinds of solvers. This case can generalize to any graph data. |
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
@EbenezerJesuraj Can you clarify how something like
@changkun I'm not familiar with such applications. For the kinds of structs you would be copying, do they typically have private fields you would also need to copy? |
@rittneje Hi, In a Business-grade application we may have to deal with a large set of applications dynamically at run-time. (Quick intro on Concurrency: Go's Concurrency of executing non-related independent tasks provides very good optimizations.) We may have to develop different types of applications for different scenarios and one can avoid writing code manually / avoid re-inventing the wheel by developing a generic module like Deep Copy to store the data received from Clients to the Server. If this generic module supports conversion of any data input field (say in the form of struct to any tabular or any other feasible means of storage) then DeepCopy.reflect might look like an ideal choice in my opinion. Say if we did a Shallow Copy we will have to use pointers and if they are not managed properly can bring a lot of mess into the code and might possibly expose business data to scammers and phishers, which is not preferrable. |
@changkun I don't understand your question. If an Imagine I have a library that exposes a struct like this: type Foo struct {
privateField string
} Your code might then leverage your proposed type Foo struct {
privateField string
someFile *os.File
} Now your code is dangerously broken for the aforementioned reason. If however I had a func (f *Foo) Copy() *Foo {
return &Foo{
privateField: f.privateField,
// this method doesn't exist today, but you get the idea
someFile: f.someFile.Copy(),
}
} Please note that |
Sure in this case it is not safe to use after deep copy. As warned in the doc:
But let's take one step back: I don't fully understand the motivation of your described use case too. How frequent for a struct will hold an unexported file? Or Is this really a good practice? When the file will be closed? Automatically using a finalizer? Ask the user to remember to call an exported Close method for closing the file? Either way seems to have a danger of not closing the file. If a package offers a customized copy method, should not using DeepCopy be considered as misuse in this package? |
Your doc comment shifts the burden to the wrong place. The whole point under discussion here is that some types cannot be naively copied, and a
I obviously cannot give a concrete answer here, but it is entirely within a library author's jurisdiction to introduce such changes. And keep in mind that just because the struct has a reference to the file, does not mean that it is what created it. As a contrived example: func NewFooWithFile(f *os.File) *Foo {
return &Foo{someFile: f}
} And as I mentioned,
Again, I don't see any purpose for |
Of course, the discussion was about singleton that should not be naively copied. The counter argumentation was if this is the case, why should not DeepCopy be avoided in this case, and considered as a misuse of this API? I think your argument is about "as long as an API can't design in a way that its user can't blindly throw arguments in and get an intuitive response out, we shouldn't have it." There are enough examples against this, and we have discussed them before.
Furthermore, even we have the concerning danger. One can still use a DeepCopy to save a lot of copying work without introducing any danger when offering customized Copy:
This is particularly helpful when Foo is a linked list, tree, graph, etc. |
What happens if type Bar struct {
f *Foo
...
} What does func (b *Bar) Copy() *Bar {
b2 := reflect.DeepCopy(b)
b2.f = b2.f.Copy()
} It very much seems like if I also still believe that if type Foo struct {
someField string `deepcopy:"true"`
someFile *os.File
}
func (f *Foo) Copy() *Foo {
type foo Foo
f2 := reflect.DeepCopy((*foo)(f))
f2.someFile = f.someFile.Copy()
return (*Foo)(f2)
} |
This is not true because a DeepCopy already copied f. Respect a customized Copy or equivalent, is similar to what was rejected in #20444
Then we don't need deep copy anyway, one can simply marshal and unmarshal.
I personally object to this idea. No special reason, sorry. (Perhaps you could fill a separate proposal?) |
This comment was marked as off-topic.
This comment was marked as off-topic.
No,
That proposal isn't relevant. I will also mention that personally, I've never had any use for
You can, except that it will be significantly less efficient and cannot handle circular data structures. Anyway, I feel like we're just cycling through the same points again and again, so I will leave it with this:
|
I agree. Until now, all of us are simply repeating the same topic repeatedly. All arguments have been repeated too many times, and yet no new insights. I'll stop arguing if there are no more new interesting argumentations:
You are requesting feature extensions that had been constantly rejected from the DeepEqual perspective. Referring to the third point of #51520 (comment)
This is a purely subjective opinion. If that's the case, the reflect package should not exist.
This is once again the first point. Referring to the updated proposal a few weeks ago:
If not enough, this could be:
which holds similar functionality to what was documented in Mutex.TryLock. |
For anyone who would like to try out this proposal, or argue against no actual implementation yet, I quickly prototyped an implementation here: After the implementation, I think this is the latest proposal: // DeepCopy copies src to dst recursively.
//
// Two values of identical type are deeply copied if one of the following
// cases apply.
//
// Numbers, bools, strings are deeply copied and have different underlying
// memory address.
//
// Slice and Array values are deeply copied, including its elements.
//
// Map values are deeply copied for all of its key and corresponding
// values.
//
// Pointer values are deeply copied for their pointed value, and the
// pointer points to the deeply copied value.
//
// Struct values are deeply copied for all fields, including exported
// and unexported.
//
// Interface values are deeply copied if the underlying type can be
// deeply copied.
//
// There are a few exceptions that may result in a deeply copied value not
// deeply equal (asserted by DeepEqual(dst, src)) to the source value:
//
// 1) Func values are still refer to the same function
// 2) Chan values are replaced by newly created channels
// 3) One-way Chan values (receive or read-only) values are still refer
// to the same channel
//
// Note that while correct uses of DeepCopy do exist, they are not rare.
// The use of DeepCopy often indicates the copying object does not contain
// a singleton or is never meant to be copied, such as sync.Mutex, os.File,
// net.Conn, js.Value, etc. In these cases, the copied value retains the
// memory representations of the source value but may result in unexpected
// consequences in follow-up usage, the caller should clear these values
// depending on their usage context.
func DeepCopy[T any](src T) (dst T) Thanks to everyone for previous clarifying discussions. PS. There was a large discussion on customization. A branched prototype implements a possible proposal (although this would be a different proposal): // DeepCopyOption represents an option to customize deep copied results.
type DeepCopyOption func(opt *copyConfig)
// DisallowCopyUnexported returns a DeepCopyOption that disables the behavior
// of copying unexported fields.
func DisallowCopyUnexported() DeepCopyOption
// DisallowCopyCircular returns a DeepCopyOption that disables the behavior
// of copying circular structures.
func DisallowCopyCircular() DeepCopyOption
// DisallowCopyBidirectionalChan returns a DeepCopyOption that disables
// the behavior of producing new channel when a bidirectional channel is copied.
func DisallowCopyBidirectionalChan() DeepCopyOption
// DisallowTypes returns a DeepCopyOption that disallows copying any types
// that are in given values.
func DisallowTypes(val ...any) DeepCopyOption
// DeepCopy copies src to dst recursively.
//
// (...)
//
// To change these predefined behaviors, use provided DeepCopyOption.
func DeepCopy[T any](src T, opts ...DeepCopyOption) (dst T) |
It doesn't sound like there is any consensus about adding this, which means we should probably decline this proposal. |
Based on the discussion above, this proposal seems like a likely decline. |
Frankly, it is not convincing and constructive enough to reject a proposal because there is no consensus in a discussion. The argument would be much stronger for technical reasons, such as language spec violation, implementation difficulty, etc. If there is a decision difficulty, it might be a reasonably well-balanced decision to put this on hold for a few years rather than close it right away. In this way, the issue can collect more opinions regarding the proposal. |
@changkun respectfully, I disagree that keeping this proposal open will help. If you want to continue the flow of ideas, I would once again suggest that you start a third-party package and use the issue tracker or discussion forum there. It's worked very well for go-cmp for example, which culminated into #45200 after years of effort and validation from real users. I would also like to give my two cents from the point of view of someone who has been quietly reading this thread: I'm getting the impression that you are defending this proposal rather than trying to see what the best outcome for Go would be, even if reaching that best outcome takes more experimentation outside of this repository. I understand that spending time on a proposal and having it rejected can be underwhelming, but this is part of what sets Go apart from other languages: it has kept the language and standard library small by rejecting many ideas. |
I am not really objecting to this idea and have already placed the prototyped implementation in a third location. However, on behalf of the purpose of this proposal, I remained a defending position in this thread. The "go-cmp" is another special case where the home organization account is google, which is more attractive to the public hence (I think) nicely yielding a successful example. The situation becomes different when such a location is fully a third party, which creates less impactful attraction and is considered much more untrustable. Based on my (naive) observation, I think the levels of trust and reliability decrease, from the standard package to golang.org/x, top-grade organizations, and then individual sources. Therefore it might be more helpful to gain more exposure when the association maintains. As (the majority, I think) developers love to gain their individual impact, and sometimes might even be more frustrating for them when their idea is rejected from somewhere, achieved success in a third place, then being "stolen" (emphasize: this is not the optimal wording, but sometimes being abused under this context) to the place where originally said no. This may be considered harmful for both parties at extreme cases.
Now, this is even more off-topic. Your encouraging words are quite appreciated in a community discussion. I think I am simply taking this chance to explore and exploit the current decision tendency when it comes to a borderline proposal. Specifically, when we (potentially) reach the Pareto optimality of a proposal, it is interesting to see the factors that determine a decision choice between all found Pareto frontiers. Compared to a controversial arena proposal, the latest comment was to add it under an experiment flag other than a similar decision of "likely decline." While I agree they might not be a completely fair comparison, this still reveals more observed potentially contradicting behaviors. In terms of this proposal, I'd say there are no actual hard feelings but simply observing a choice after an explorative discussion. Subsequently, this made me an idea of having an official field experiment location might be interesting, where every controversial or suboptimal proposal could encourage people to contribute to the implementation. That place has no guarantee of any kind. Nevertheless, I'll stop here as it is completely not the topic of this proposal discussion. |
Okay, so perhaps more examples would help. Take https://github.com/frankban/quicktest, which led to accepted proposals like #41260 and #32111.
It really is comparing apples and oranges. It is literally impossible to implement a runtime change like arenas outside of the Go source tree. It is of course within your right to stick to your arguments - my argument was that you would have a better chance of success if you emulate success stories like go-cmp or quicktest. It takes a lot more patience and effort to do it that way, but that's what it takes to get APIs right and write good proposals with detail and user experience. |
Agree. On behalf of the proposal, I think we could continue arguing this, such as those examples exist, but are still rare compared to the declined ones, etc. :)
This part I am not sure about. It might be more suitable to say comparing ants and elephants. From the discussion there, it seems it remains possible to implement the arena in a third package.
I do not object to this argument at all. That's why the proposal summarizes many existing practices from pkg.go.dev. |
I don't think I will be able to convince you, because you seem resolute that it is a good idea to have eventually such a function in the standard library but I still see problems. But you don't know at which depth such types may appear in a datastructure. Especially if these are unexported types that appear in unexported fields. So even if deep-copying is possible it might not be a good idea in the general case to do it indiscriminately. Now in the few cases where it is desirable, it shouldn't be difficult to write the copying function by hand a priori. Note that comparing values does not suffer from this because a comparison is not effectful in general. Which is why having deepEqual is less a problem. I think that warrants caution and while I appreciate that you mentioned a few edge cases in the documentation, I think the issue can be more pervasive and seem underestimated. |
Hi, Having a function such as Deep - Copy which is again shared by many popular languages such as Python,JAVA in itself is a very good valid argument to have the feature. In my opinion i don't see any drawbacks in language having such a feature in a standard library. The issue as of now in my opinion seems to be more of a developer shortage one to implement such a feature in the library. But again the Benefits far outweigh the cons is what i am able to infer from here. |
I do like the Apples and Oranges references here.. |
No change in consensus, so declined. |
Previous proposal updates
As a duality complement of `DeepEqual`, I propose to add a new API for deep copy:DeepCopy may likely be very commonly used. Implement it in reflect package may receive better runtime performance.
The frist version seems more preferrable because type parameters permits compile time type checking, but the others might also be optimal.
The proposed document of the API is preliminary.
Update:
Based on the discussions until #51520 (comment), the documented behavior could be:
Depending on the design choice, the content included and after exceptions could decide on different behaviors, see
#51520 (comment), and #51520 (comment) as examples.
The API signature could select one of the following possible options. The main differences between them include two aspects:
Either version is optimal, and purely depending on the final choice.
Update (until #51520 (comment)):
I propose to add a new API for deep copy:
Here is an implementation for trying out: https://pkg.go.dev/golang.design/x/reflect
The text was updated successfully, but these errors were encountered: