1 <!-- source: socat-exec.html -->
2 <!-- Copyright Gerhard Rieger 2009 -->
3 <html><head>
4 <title>Socat address chains</title>
5 <link rel="stylesheet" type="text/css" href="dest-unreach.css">
6 </head>
7  
8 <body>
9  
10 <h1>Socat address chains</h1>
11  
12 <a name="introduction"/>
13 <h2>Introduction</h2>
14 <p>Socat version 2 can concatenate multiple modules and transfer data between
15 them bidirectionally.
16 </p>
17  
18  
19 <a name="example1"/>
20 <h2>Example 1: OpenSSL via HTTP proxy</h2>
21  
22 <span class="frame"><span class="shell">
23 socat - "OPENSSL,verify=0 | PROXY:secure.domain.com:443 | TCP:proxy.domain.com:8080"
24 </span></span>
25  
26 <p>This command does the following: socat connects to proxy.domain.com on port
27 8080 and sends a proxy CONNECT request for secure.domain.com port 443; this is
28 similar to the proxy address available in version 1. Once the proxy server
29 acknowledges a successful connection to the target (SSL) server, socat starts
30 SSL negotiation and then
31 transfers data between its stdio and the SSL server.
32 </p>
33  
34  
35 <a name="basics"/>
36 <h2>Address chain basics</h2>
37  
38 <p>socat version 1 was able to open two addresses and transfer data between
39 them. "Addresses" could be just sockets or other file descriptors, or could
40 be a little more complex, like a proxy client or an OpenSSL server or client.
41 Although desirable, it was practically impossible to combine complex address
42 types, or to use socket types other than the predefined ones (usually TCP)
43 with complex addresses.
44 </p>
45 <p>socat version 2 has been designed to overcome these limitations. First, the
46 complex address types are now separated from the underlying file descriptor
47 types. Second, complex addresses, now called <em>inter addresses</em>,
48 can be concatenated into an <em>address chain</em>; however, an <em>endpoint
49 address</em> that just provides file descriptors must be the last component
50 of an address chain.
51 </p>
52 <p>The socat invocation takes two address chains, opens them, and transfers
53 data between them.
54 </p>
55 <p>An address chain consists of zero or more inter addresses and one endpoint
56 address, all separated by the pipe character '|'. When starting socat from
57 the command line, these characters and the optional spaces must be protected
58 from the shell; it is recommended to enclose each address chain in double
59 quotes.
60 </p>
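<p>The structure described above can be illustrated with a minimal Python
sketch. It is not socat's parser: the function name <tt>split_chain</tt> is
hypothetical, and the sketch assumes that '|' never occurs inside an address
(it ignores any quoting or escaping rules socat may apply).
</p>

```python
def split_chain(chain):
    """Split an address chain into (inter_addresses, endpoint_address).

    Illustrative only: assumes '|' is never quoted or escaped inside
    an address, which real socat parsing may handle differently.
    """
    parts = [part.strip() for part in chain.split("|")]
    if any(not part for part in parts):
        raise ValueError("empty component in address chain")
    # Zero or more inter addresses, followed by exactly one endpoint address.
    return parts[:-1], parts[-1]


if __name__ == "__main__":
    chain = "OPENSSL,verify=0 | PROXY:secure.domain.com:443 | TCP:proxy.domain.com:8080"
    inters, endpoint = split_chain(chain)
    print("inter addresses:", inters)
    print("endpoint address:", endpoint)
```

<p>A chain that consists of an endpoint address only, such as <tt>-</tt>,
yields an empty list of inter addresses.
</p>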
61 <p>The (bidirectional) inter addresses that are available with a socat
62 implementation can be listed with the following command:
63 </p>
64 <span class="frame"><span class="shell">
65 socat -h |egrep 'b ..b groups='</span></span>
66 <p>A full socat 2.0.0-b3 program provides the following inter addresses:
67 </p>
68 <table border=1>
69 <tr><th>name</th><th>description</th></tr>
70 <tr><td>NOP</td><td>transfers data unmodified</td></tr>
71 <tr><td>OPENSSL-CLIENT</td><td>performs OpenSSL client negotiation, then
72 encrypts/decrypts data</td></tr>
73 <tr><td>OPENSSL-SERVER</td><td>performs OpenSSL server negotiation, then
74 encrypts/decrypts data </td></tr>
75 <tr><td>PROXY</td><td>performs proxy CONNECT client negotiation, then
76 transfers data unmodified</td></tr>
77 <tr><td>SOCKS4</td><td>performs socks 4 client negotiation, then
78 transfers data unmodified</td></tr>
79 <tr><td>SOCKS4A</td><td>performs socks 4a client negotiation, then
80 transfers data unmodified</td></tr>
81 <tr><td>SOCKS5</td><td>performs socks 5 TCP client negotiation, then
82 transfers data unmodified</td></tr>
83 <tr><td>TEST</td><td>appends &gt; to blocks transferred forward, and &lt; to
84 blocks transferred in reverse</td></tr>
85 <tr><td>EXEC</td><td>invokes a program
86 (see <a href="socat-exec.html">socat-exec.html</a>), then transfers data unmodified</td></tr>
87 <tr><td>SYSTEM</td><td>invokes the shell (see <a href="socat-exec.html">socat-exec.html</a>), then transfers data unmodified</td></tr>
88  
89 </table>
90  
91  
92 <a name="reverse"/>
93 <h2>Reverse address use</h2>
94  
95 <p>Inter addresses have two interfaces. In most cases one of
96 these can be seen as a <em>data</em> interface, where arbitrary data
97 traffic may occur, and the other as a <em>protocol</em> interface, where the
98 transferred data has to follow some rules, such as the socks or HTTP protocol, or
99 valid encryption.
100 </p>
101 <p>Bidirectional inter addresses are usually implemented such that their data
102 interface is on the "left" side, and the protocol interface on the "right"
103 side.
104 </p>
105 <p>It may be convenient to build an address chain where one or more inter
106 addresses work in the reverse direction, so that their protocol side is connected
107 to the left neighbor in the chain using the protocol, and their data side is
108 connected to the right neighbor for raw data transfer. socat allows inter
109 addresses to be used in the <em>reverse</em> direction by prefixing their keyword with
110 <tt>^</tt>.
111 </p>
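<p>The orientation rule can be sketched in a few lines of Python. The
function <tt>interface_sides</tt> is a hypothetical helper, not part of
socat; it assumes that the keyword ends at the first ',' or ':' and that a
leading ^ marks reverse use.
</p>

```python
def interface_sides(component):
    """Return (keyword, left_side, right_side) for one inter address.

    Hypothetical helper: a forward inter address has its data interface
    on the left and its protocol interface on the right; a leading '^'
    swaps the two sides.
    """
    text = component.strip()
    reverse = text.startswith("^")
    if reverse:
        text = text[1:]
    # Assumption: the keyword ends at the first option or parameter separator.
    keyword = text.split(",")[0].split(":")[0].strip().upper()
    if reverse:
        return keyword, "protocol", "data"
    return keyword, "data", "protocol"


if __name__ == "__main__":
    print(interface_sides("PROXY:secure.domain.com:443"))
    print(interface_sides("^OPENSSL-SERVER,cert=server.pem"))
```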
112  
113  
114 <a name="example2"/>
115 <h2>Example 2: SSL to TCP gateway</h2>
116  
117 <p>Endpoint addresses that fork should usually form the first socat address
118 chain, without inter addresses. To create an SSL to TCP gateway that
119 handles multiple connections, the following command line does the job:
120 </p>
121 <span class="frame"><span class="shell">
122 socat TCP-LISTEN:443,reuseaddr,fork "^OPENSSL-SERVER,cert=server.pem | TCP:somehost:80"
123 </span></span>
124  
125 <p>Without the reverse usage of the SSL server address, socat would "speak"
126 clear text with the clients that connected to its left address, and SSL to
127 somehost.
128 </p>
129  
130  
131 <a name="unidirectional"/>
132 <h2>Unidirectional data transfer</h2>
133  
134 <p>As in socat version 1, it is possible to specify unidirectional transfers
135 with version 2. Use socat options <a href="socat.html#OPTION_u">-u</a> or
136 <a href="socat.html#OPTION_U">-U</a>.
137 </p>
138 <p>Unidirectional transfer must be supported by the involved inter addresses;
139 e.g., SSL requires a bidirectional channel for negotiation of encryption
140 parameters etc.
141 </p>
142 <p>It is possible to mix uni- and bidirectional transfers within one address
143 chain: Think of a simple file transfer over SSL.
144 </p>
145 <p>The socat help function can tell us which address types support which kinds
146 of transfer:</p>
147 <span class="frame"><span class="shell">
148 socat -h |egrep 'openssl-server'</span></span>
149 <p>gives the following output:
150 </p>
151 <p><pre> openssl-server rwb b groups=CHILD,RETRY,OPENSSL
152 openssl-server:&lt;port&gt; rwb groups=FD,SOCKET,LISTEN,CHILD,RETRY,RANGE,IP4,IP6,TCP,OPENSSL</pre>
153 </p>
154 <p>The <tt>rwb &nbsp; b</tt> flags mean that this address type can handle read-only,
155 write-only, and bidirectional transfers on its left (data) side, but only
156 bidirectional transfers on its right (protocol) side.
157 </p>
158 <p>The second line describes the (version 1) endpoint form: no right side
159 traffic kinds are specified because this address type establishes its protocol
160 communication itself.
161 </p>
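<p>For scripts that need to inspect these flags, the two help lines shown
above could be parsed roughly as follows. This is a sketch that assumes the
whitespace-separated layout shown in this document; the function name
<tt>parse_help_line</tt> and the returned dictionary keys are hypothetical.
</p>

```python
def parse_help_line(line):
    """Parse one address line of 'socat -h' output as shown in this document.

    Assumed layout: name, left-side flags, optional right-side flags,
    then 'groups=' followed by a comma-separated list.
    """
    tokens = line.split()
    name = tokens[0]
    flags = []
    groups = []
    for token in tokens[1:]:
        if token.startswith("groups="):
            groups = token[len("groups="):].split(",")
        else:
            flags.append(token)
    left = flags[0] if flags else ""
    # Endpoint forms list no right-side flags at all.
    right = flags[1] if len(flags) == 2 else ""
    return {"name": name, "left": left, "right": right, "groups": groups}


if __name__ == "__main__":
    info = parse_help_line("openssl-server rwb b groups=CHILD,RETRY,OPENSSL")
    print(info["name"], "left:", info["left"], "right:", info["right"])
```

<p>An empty <tt>right</tt> value then indicates an endpoint form that
establishes its protocol communication itself.
</p>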
162  
163 <a name="dual"/>
164 <h2>Dual inter addresses</h2>
165  
166 <p>In socat version 1 it was already possible to combine two unidirectional
167 addresses into one bidirectional address. This idea has been extended in version
168 2: two unidirectional inter addresses can be combined into one bidirectional
169 transfer unit.
170 </p>
171 <p><em>Note: in version 1, the dual specification was like
172 </em><tt>righttoleft!!lefttoright</tt><em>. In version 2, it is:
173 </em><tt>lefttoright%righttoleft</tt><em>. This is the only major incompatibility
174 between versions 1 and 2.</em>
175 </p>
176 <p>With the few inter address types available so far, this feature has no
177 practical use except with <a href="socat-exec.html">exec and system</a> type
178 addresses. However, the general function is described here using the
179 hypothetical inter address types <tt>gzip</tt> and <tt>gunzip</tt>.
180 </p>
181 <p>Let us design these inter address types: <tt>gzip</tt> is a module that
182 reads arbitrary data on its left (data) side, compresses it, and writes the
183 compressed data to its right (protocol) side neighbor.
184 <!-- Data that arrives onits right side is uncompressed and passed to the
185 left neighbor. -->
186 </p>
187 <p><tt>gunzip</tt> reads gzip compressed data on its left side and writes the
188 raw uncompressed data on its right side.
189 </p>
190 <p>socat can combine these to provide a bidirectional compress/decompress
191 function:<br>
192 <tt>gzip%gunzip</tt>
193 </p>
194 <p>Data coming from the left is passed through gzip and sent to the right;
195 data coming from the right is passed through gunzip and sent to the left.
196 </p>
197 <p>When the reverse functionality is desired this arrangement does the job:<br>
198 <tt>gunzip%gzip</tt>
199 </p>
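<p>The behaviour of such a dual unit can be modelled in Python with the
standard <tt>gzip</tt> module. This is only a sketch of the idea: the
<tt>gzip</tt> and <tt>gunzip</tt> inter addresses are hypothetical, the names
<tt>MODULES</tt> and <tt>make_dual</tt> are invented for illustration, and
whole messages are processed instead of streams.
</p>

```python
import gzip

# Hypothetical unidirectional modules, keyed by their address keyword.
MODULES = {"gzip": gzip.compress, "gunzip": gzip.decompress}


def make_dual(spec):
    """Build (forward, backward) functions from 'lefttoright%righttoleft'."""
    left_to_right, right_to_left = spec.split("%")
    return MODULES[left_to_right], MODULES[right_to_left]


if __name__ == "__main__":
    forward, backward = make_dual("gzip%gunzip")
    wire = forward(b"data coming from the left")   # compressed on its way right
    print(len(wire), "bytes sent to the right")
    print(backward(wire))                          # decompressed on its way left
```

<p>Building the unit from <tt>gunzip%gzip</tt> instead swaps the two
functions, which corresponds to the reverse arrangement.
</p>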
200  
201  
202 <a name="fork"/>
203 <h2>fork</h2>
204  
205 <p>socat provides the <tt>fork</tt> address option for uses like network
206 servers where multiple clients can connect and are handled in parallel in
207 different socat sub processes.
208 </p>
209 <p>When the sub processes should work independently (share no socat file
210 descriptors) the fork option must be applied to the last component of the
211 first address chain. For better readability it is advisable to have only the
212 "left" endpoint address in the left chain and put all intermediate addresses
213 into the right chain.
214 </p>
215  
216  
217 <a name="understanding"/>
218 <h2>Understanding chain implementation</h2>
219  
220 <p>The idea of concatenated modules in socat is not new, but a few attempts to
221 completely rewrite and enhance the socat transfer engine
222 were never completed. Eventually, it was decided to choose an approach that
223 requires only moderate changes to socat's transfer engine and the existing
224 address types.
225 </p>
226 <p>Think of several socat version 1-like processes somehow combined with an abstract
227 operator || :
228 </p>
229 <span class="frame"><span class="shell">
230 socat - openssl || socat - proxy:secure.domain.com || socat - tcp:proxy.domain.com:8080
231 </span></span>
232 <p>The solution was to put all these into one process but have each socat engine
233 run in its own thread. The transfer between the engines goes over socket
234 pairs, so the engines see file descriptors as usual. The main work then was
235 to implement the functionality for opening address chains which includes
236 parsing, creating socket pairs and threads, combining the addresses, taking
237 care of unidirectional, dual, and reverse addresses etc.
238 </p>
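<p>The thread-per-engine design can be illustrated with a small Python model.
It is not socat code: the names <tt>engine</tt> and <tt>run_chain</tt> are
hypothetical, each engine only copies data from left to right (socat's
engines are bidirectional), and the input is assumed to be small enough to
fit into the socket buffers.
</p>

```python
import socket
import threading


def engine(left, right, transform):
    """One transfer engine: copy blocks from left to right until EOF."""
    while True:
        block = left.recv(4096)
        if not block:
            break
        right.sendall(transform(block))
    right.shutdown(socket.SHUT_WR)  # pass the EOF on to the next engine


def run_chain(data, transforms):
    """Send data through one engine thread per transform, linked by socket pairs."""
    pairs = [socket.socketpair() for _ in range(len(transforms) + 1)]
    threads = []
    for index, transform in enumerate(transforms):
        # Each engine reads from one socket pair and writes to the next one.
        args = (pairs[index][1], pairs[index + 1][0], transform)
        thread = threading.Thread(target=engine, args=args)
        thread.start()
        threads.append(thread)
    head, tail = pairs[0][0], pairs[-1][1]
    head.sendall(data)  # assumption: small input, fits into the socket buffers
    head.shutdown(socket.SHUT_WR)
    result = bytearray()
    while True:
        block = tail.recv(4096)
        if not block:
            break
        result += block
    for thread in threads:
        thread.join()
    for first, second in pairs:
        first.close()
        second.close()
    return bytes(result)


if __name__ == "__main__":
    print(run_chain(b"abc", [bytes.upper, lambda b: b.replace(b"A", b"4")]))
```

<p>As in the description above, every engine sees only ordinary file
descriptors and does not need to know what its neighbors do.
</p>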
239 <p>Here is the socat version 2 command line of example 1:<br>
240 <tt>socat - "OPENSSL,verify=0 | PROXY:secure.domain.com:443 | TCP:proxy.domain.com:8080"</tt></p>
241 <p>A schematic representation of how this is realized in socat:<br>
242 <tt>STDIO - engine[thread 0] - OPENSSL - socket pair - (FD) - engine[thread 1]
243 - PROXY - socket pair - (FD) - engine[thread 2] - TCP</tt>
244 </p>
245 <p>where FD means a trivial address similar to the FD (file descriptor) address
246 type.
247 </p>
248 <p>For debugging address chains it proved useful to write down two lines and to note the actual file descriptor numbers:</p>
249 <pre>STDIO  ^  OPENSSL  |     ^  PROXY  |     ^  TCP
250 0,1    ^  6        |  7  ^  4      |  5  ^  3</pre>
251 <p>The symbol <b>^</b> means a socat transfer engine.
252 </p>
253  
254 <p>Now the implementation of the reverse address feature should be easier to
255 understand. While a forward address is placed on the right side of its
256 engine, a reverse address is simply placed on the left side. Example 2 can be
257 explained as follows:
258 </p>
259 <p>Example 2 command line:<br>
260 <tt>socat TCP-LISTEN:443,reuseaddr,fork "^OPENSSL-SERVER,cert=server.pem |
261 TCP:somehost:80"</tt>
262 </p>
263 <p>Schematic representation:<br>
264 <tt>TCP-LISTEN - engine[thread 0] - (FD) - socket pair - OPENSSL-SERVER -
265 engine[thread 1] - TCP</tt>
266 </p>
267 <p>Debug schema:<br>
268 <pre>
269 TCP-L  ^     |  SSL-SERV  ^  TCP
270 3      ^  5  |  6         ^  4</pre>
271  
272  
273 <a name="commtypes"/>
274 <h2>Communication types</h2>
275  
276 <p>For communication between the address modules of consecutive transfer
277 engines, socat provides pairs (or quadruples) of file descriptors. You may
278 think of these as two normal UNIX pipes (fifos), one for left-to-right and
279 the other for right-to-left data transfer.
280 </p>
281 <p>There are a few requirements that these file descriptors should fulfill;
282 however, they differ depending on the libraries used by the inter
283 address modules (e.g. libopenssl) or on the external programs that are involved
284 (see <a href="socat-exec.html">socat-exec.html</a>).
285 </p>
286 <p>The factors to consider for these file descriptors are:
287 </p>
288 <ul>
289 <li>Half close: when a module terminates communication on its write channel,
290 its read channel should still stay open.</li>
291 <li>Half close method: A module might half close a connection
292 using <tt>close()</tt> or <tt>shutdown()</tt> methods.</li>
293 <li>Buffering: The output buffering behaviour of some modules can be
294 influenced by the type of file descriptor.</li>
295 <li>INET: Some external programs require a TCP/IPv4 file descriptor.</li>
296 </ul>
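<p>The half close requirement from the list above can be demonstrated with a
single socket pair in Python. This is a sketch of the <tt>shutdown()</tt>
method only; the function names <tt>read_all</tt> and
<tt>half_close_demo</tt> are hypothetical, and both ends live in one thread,
which works here because the messages are small.
</p>

```python
import socket


def read_all(sock):
    """Read from sock until the peer has closed its write channel."""
    data = bytearray()
    while True:
        block = sock.recv(4096)
        if not block:
            return bytes(data)
        data += block


def half_close_demo():
    """Half close one direction with shutdown() and keep using the other."""
    a, b = socket.socketpair()
    a.sendall(b"request")
    a.shutdown(socket.SHUT_WR)      # a terminates its write channel only
    received = read_all(b)          # b sees EOF after the request
    b.sendall(b"reply to " + received)
    b.close()
    reply = read_all(a)             # a's read channel is still open
    a.close()
    return received, reply


if __name__ == "__main__":
    print(half_close_demo())
```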
297 <p>This table lists the available communication types and their
298 properties:</p>
299 <table border=1>
300 <tr><th>comm.type</th><th>half close with close()</th><th>allows shutdown</th><th>avoids buffering</th><th>TCP/IPv4</th></tr>
301 <tr><td>socketpairs</td><td>OK</td><td>OK</td><td>no</td><td>no</td></tr>
302 <tr><td>socketpair</td><td>no</td><td>OK</td><td>no</td><td>no</td></tr>
303 <tr><td>pipes</td><td>OK</td><td>no</td><td>no</td><td>no</td></tr>
304 <tr><td>ptys</td><td>OK</td><td>no</td><td>yes</td><td>no</td></tr>
305 <tr><td>tcp</td><td>no</td><td>yes</td><td>no</td><td>yes</td></tr>
306 </table>
307  
308 <p>The default is socketpairs.
309 </p>
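<p>When choosing a communication type for a given requirement, the table
above can be treated as a small lookup structure. The following Python sketch
merely encodes the table; the names <tt>COMM_TYPES</tt> and
<tt>candidates</tt> and the property keys are hypothetical and do not exist
in socat.
</p>

```python
# Properties of each communication type, copied from the table above.
COMM_TYPES = {
    "socketpairs": {"close": True, "shutdown": True, "unbuffered": False, "inet": False},
    "socketpair": {"close": False, "shutdown": True, "unbuffered": False, "inet": False},
    "pipes": {"close": True, "shutdown": False, "unbuffered": False, "inet": False},
    "ptys": {"close": True, "shutdown": False, "unbuffered": True, "inet": False},
    "tcp": {"close": False, "shutdown": True, "unbuffered": False, "inet": True},
}


def candidates(*needs):
    """Return the communication types that provide every requested property."""
    return [name for name, props in COMM_TYPES.items()
            if all(props[need] for need in needs)]


if __name__ == "__main__":
    print("half close with close():", candidates("close"))
    print("close() and shutdown():", candidates("close", "shutdown"))
    print("TCP/IPv4 descriptor:", candidates("inet"))
```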
310 <p>The overall communication type can be chosen using the <a href="socat.html#option_c"><tt>-c</tt></a> socat
311 option. With socat 2.0.0-b3 it is not possible to use different communication
312 types in one process (exception: right side of exec/system modules).
313 </p>
314  
315 <p><small>Copyright: Gerhard Rieger 2009</small><br>
316 <small>License: <a href="http://www.fsf.org/licensing/licenses/fdl.html">GNU Free Documentation License (FDL)</a></small>
317 </p>
318  
319 </body>
320 </html>