sock5 proxy support #70

Closed
advincze opened this issue Mar 16, 2017 · 6 comments

@advincze
Contributor

I need to interact with HDFS inside a private subnet in AWS through a secure jump host, and I currently do that over a SOCKS5 proxy.

I submitted #62 so that I can set up the namenode connection myself and pass it to the client.

This works for the cases where I only need to talk to the metadata server.

However, if I want to write to a file, the block writer opens a non-proxied connection, which does not work for me:

func (bw *BlockWriter) connectNext() error {
	address := getDatanodeAddress(bw.currentPipeline()[0])
	conn, err := net.DialTimeout("tcp", address, connectTimeout)
	if err != nil {
		return err
	}
...

Here I need to set up the proxy again:

	dialer, err := proxy.SOCKS5("tcp", "localhost:8157", nil, proxy.Direct)
	if err != nil {
		return err
	}

	conn, err := dialer.Dial("tcp", address)
	if err != nil {
		return err
	}

I'd be happy to submit a patch, but I'm not sure what interface would be best here.

Both rpc.NewNamenodeConnection and rpc.BlockWriter.connectNext() create their connections with

conn, err := net.DialTimeout("tcp", address, connectTimeout)

so this could be generalized to use a

type DialerTimeout interface {
    DialTimeout(network, address string, timeout time.Duration) (net.Conn, error)
}

or a

type DialerTimeoutFunc func(network, address string, timeout time.Duration) (net.Conn, error)

but NewNamenodeConnection dials immediately, so there would be no way to set it before the connection is made.

Or I could add this field to NamenodeConnection:

type NamenodeConnection struct {
	...
	BlockWriterDialTimeout func(network, address string, timeout time.Duration) (net.Conn, error)
	...
}

and rpc.BlockWriter could use it when it is not nil.

Or do you have a better suggestion?

@colinmarc
Owner

Sorry for the late response here!

net already has an interface for this: net.Dialer. It should be an option for Client - and then used for both the namenode connection and the blockwriter connections - but we're getting to kind of an awkward place with so many options. Might be time to add a ClientOptions thingy, and deprecate the old constructors.

@colinmarc
Owner

Sorry, as soon as I hit post I realized net.Dialer is a struct, not an interface. That makes it a bit trickier, but I think we can take net/http's approach and have a DialFunc func(network, address string) (net.Conn, error) option.

(net.Dialer lets you specify a timeout which is used when you call Dial; DialTimeout is just a convenience method for that.)

@advincze
Contributor Author

advincze commented Mar 27, 2017

I think we should rather use net.Dialer's DialContext, since that is the recommended way now.

For the interface, we could do something like this (since New(...) is already taken):

type ClientOptions struct {
	Address     string
	Username    string
	DialFunc    func(ctx context.Context, network, addr string) (net.Conn, error)
}

func NewClient(options ClientOptions) (*Client, error)

Alternatively, I also like the functional options pattern, which would look something like this:

type ClientOption func(*Client) error

func Address(address string) ClientOption
func Username(username string) ClientOption
func DialFunc(dialFn func(ctx context.Context, network, addr string) (net.Conn, error)) ClientOption

func NewClient(options ...ClientOption) (*Client, error)

And it could be used like this:

func main() {
	dialer, err := proxy.SOCKS5("tcp", "localhost:8157", nil, proxy.Direct)
	if err != nil {
		panic(err)
	}

	hdfsCli, err := hdfs.NewClient(
		hdfs.Address("127.0.0.1:8020"),
		hdfs.Username("hadoop"),
		hdfs.DialFunc(dialer.Dial),
	)
	if err != nil {
		panic(err)
	}
	_ = hdfsCli // use the client here
}

This does not entirely work for my use case, since proxy does not have DialContext yet, but it will eventually, and for the time being it can be wrapped.

In both cases the DialFunc would have to be passed on to the BlockWriter. That could be done through NamenodeConnection.

What do you think?

@colinmarc
Owner

The former sounds right - and you're totally correct about DialContext rather than Dial. Let me take a stab at the options thing without a dialfunc first.

@colinmarc
Owner

Ok, this is going to be a bit painful. We need to thread the dialer down into:

  • BlockReader and ChecksumReader (and deprecate NewBlockReader, or something)
  • BlockWriter (this takes the NamenodeConnection, though, so the signature doesn't need to change there)
  • NamenodeConnection (which means deprecating NewNamenodeConnection)
  • and, of course, Client

@colinmarc
Owner

With #77, there now exists a ClientOptions struct, so this should now be implementable.
