Skip to content

[SR-4036] Memory leak with String.cString(using: Encoding) #3885

@pushkarnk

Description

@pushkarnk
Member
Previous ID SR-4036
Radar rdar://problem/73153090
Original Reporter @pushkarnk
Type Bug

Attachment: Download

Additional Detail from JIRA
Votes 1
Component/s Foundation
Labels Bug
Assignee None
Priority Medium

md5: 342a1be011db14a8a68d02d0d35c641e

Issue Description:

This code leaks memory

for i in 1...100000 {
    let s: String = "test\(i)"
    let c = s.cString(using: .utf8)
    print(c)
}

Activity

pushkarnk

pushkarnk commented on Feb 22, 2017

@pushkarnk
MemberAuthor

On investigation, I could see that String.cString() on Linux calls NSString.cString() under the covers:
https://github.com/apple/swift-corelibs-foundation/blob/master/Foundation/NSStringAPI.swift#L371

Further, NSString.cString() malloc's memory and returns the address of that memory to the caller:
https://github.com/apple/swift-corelibs-foundation/blob/master/Foundation/NSString.swift#L834

String.cString() calls _persistCString that copies the contents of this malloc'd memory into a [CChar] and returns it.
https://github.com/apple/swift/blob/master/stdlib/public/core/CString.swift#L167

We do not free the memory that was malloc'd by NSString.cString() and it's clear that malloc'd memory does not come under the purview of the ARC. Looks like we need to call `free` explicitly.

pushkarnk

pushkarnk commented on Feb 22, 2017

@pushkarnk
MemberAuthor

With String.cString() we can attempt to free up the malloc'd memory after copying the contents to [CChar]. But with NSString.cString() this becomes the responsibility of the application.

pushkarnk

pushkarnk commented on Feb 22, 2017

@pushkarnk
MemberAuthor
belkadan

belkadan commented on Feb 22, 2017

@belkadan

On Apple platforms the returned string is autoreleased. I think the Foundation folks just didn't have a great answer on non-Apple platforms. cc @phausler

phausler

phausler commented on Feb 22, 2017

@phausler
Contributor

it is leaked; I have a prototype that would make a pseudo autorelease but that was considered to be "distasteful" 🙁

pushkarnk

pushkarnk commented on Feb 25, 2017

@pushkarnk
MemberAuthor

Comment from @parkera on the PR (which I've closed):

```
Hm, I'm not really a fan of this approach for a couple of reasons:
It makes this file different across darwin and Linux, because on darwin that C string is backed by an autoreleased NSData and it would be a serious error to free it.
It leaves the leak in place at the source of the problem, which is NSString's cString (and utf8string, which we do nothing about here).
It seems like we should elide cString and utf8String from NSString on Linux, because there is no choice but for them to leak, and implement this method in a way which is more canonical (perhaps data:usingEncoding: followed by a copy).
```

phausler

phausler commented on Feb 26, 2017

@phausler
Contributor

http://swift.sandbox.bluemix.net/#/repl/58b24367f008f226a7910246

That would be a way we could have autorelease work on linux. And the NSData backing could be done as

internal func _bytesInEncoding(_ str: NSString, _ encoding: String.Encoding, _ fatalOnError: Bool, _ externalRep: Bool, _ lossy: Bool) -> UnsafePointer<Int8>? {
    let theRange = NSMakeRange(0, str.length)
    var cLength = 0
    var used = 0
    var options: NSString.EncodingConversionOptions = []
    if externalRep {
        options.formUnion(.externalRepresentation)
    }
    if lossy {
        options.formUnion(.allowLossy)
    }
    if !str.getBytes(nil, maxLength: Int.max - 1, usedLength: &cLength, encoding: encoding.rawValue, options: options, range: theRange, remaining: nil) {
        if fatalOnError {
            fatalError("Conversion on encoding failed")
        }
        return nil
    }
    
    let buffer = malloc(cLength + 1)!.bindMemory(to: Int8.self, capacity: cLength + 1)
    if !str.getBytes(buffer, maxLength: cLength, usedLength: &used, encoding: encoding.rawValue, options: options, range: theRange, remaining: nil) {
        fatalError("Internal inconsistency; previously claimed getBytes returned success but failed with similar invocation")
    }
    
    buffer.advanced(by: cLength).initialize(to: 0)
    let data = NSData(bytesNoCopy: buffer, length: cLength, deallocator: .custom( { (bytes, length) in
        free(bytes)
    }))
    Unmanaged<NSData>.passUnretained(data).autorelease()

    return UnsafePointer(buffer)
}
phausler

phausler commented on Feb 26, 2017

@phausler
Contributor

Of course the requirement for that code is that it would need to be housed in a autoreleasepool call else it would hard abort and it is worth noting that swift on linux does not have a root pool installed by default. We could at CF constructor time push a root pool and then with an atexit pop the pool. That would be a long stretch to accommodate for the handful of APIs that require autorelease behavior and it might be more reasonable to offer a new SPI:

extension NSString {
    internal func withUnsafeCString<Result>(encoding: UInt, apply: (UnsafePointer<Int8>?) throws -> Result) rethrows -> Result {
        let buffer = _bytesInEncoding(self, String.Encoding(rawValue: encoding), false, false, false)
        defer { free(buffer) }
        return apply(buffer)
    }
}

extension String {
    public func cString(using encoding: Encoding) -> [CChar]? {
        return withExtendedLifetime(_ns) { (s: NSString) -> [CChar]? in
            return s.withUnsafeCString(encoding: encoding.rawValue) { 
                return _persistCString($0)
            }
        }
    }
}
phausler

phausler commented on Feb 26, 2017

@phausler
Contributor

That all being said: the signature `public func cString(using encoding: Encoding) -> [CChar]?` is not exactly proper since it should be a fast-path accessor if the string is eight bit represented (it should return inner pointer) and the Array<CChar> will cause an allocation as well as a full copy of the contents.

pushkarnk

pushkarnk commented on Mar 1, 2017

@pushkarnk
MemberAuthor

@phausler Tested your second solution. As expected, it works well but it doesn't address the leak in NSString.cString().

weissi

weissi commented on Jan 8, 2019

@weissi
Contributor

@phausler/@parkera could we maybe deprecate String.cString(using🙂 ? Returning an interior pointer without lifetime management is just not that great, especially in Swift so I'd say let's deprecate it in favour of `withCString`

belkadan

belkadan commented on Jan 8, 2019

@belkadan

It's still needed occasionally for things like super.init, which can't be nested in a closure just yet.

kmahar

kmahar commented on Jan 13, 2021

@kmahar

I have nothing to add in terms of what the right way to fix this is, but I just wanted to call out that if this is not going to be fixed I think it should be made far more obvious that this is an issue. Deprecation, a note on the docs site, something.... would be great. I accidentally wrote some code using this method ~3 years ago when I was new to Swift and just figured out today that it's been causing memory leaks for Linux users. Because I have for the most part only developed and checked for memory leaks on macOS I never noticed it until now.

weissi

weissi commented on Jan 13, 2021

@weissi
Contributor

+1, we should definitively document how you're supposed to hold this API (or deprecate it but as Jordan points out there are some (IMHO rare) usecases)

typesanitizer

typesanitizer commented on Jan 13, 2021

@typesanitizer

@swift-ci create

8 remaining items

Loading
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

      Development

      No branches or pull requests

        Participants

        @parkera@phausler@weissi@belkadan@pushkarnk

        Issue actions

          [SR-4036] Memory leak with String.cString(using: Encoding) · Issue #3885 · swiftlang/swift-corelibs-foundation