BadVPN – Blame information for rev 1
Rev | Author | Line No. | Line |
---|---|---|---|
1 | office | 1 | Raw TCP/IP interface for lwIP |
2 | |||
3 | Authors: Adam Dunkels, Leon Woestenberg, Christiaan Simons |
||
4 | |||
5 | lwIP provides three application programming interfaces (APIs) for programs |
||
6 | to use for communication with the TCP/IP code: |
||
7 | * low-level "core" / "callback" or "raw" API. |
||
8 | * higher-level "sequential" API. |
||
9 | * BSD-style socket API. |
||
10 | |||
11 | The raw API (sometimes called native API) is an event-driven API designed |
||
12 | to be used without an operating system that implements zero-copy send and |
||
13 | receive. This API is also used by the core stack for interaction between |
||
14 | the various protocols. It is the only API available when running lwIP |
||
15 | without an operating system. |
||
16 | |||
17 | The sequential API provides a way for ordinary, sequential, programs |
||
18 | to use the lwIP stack. It is quite similar to the BSD socket API. The |
||
19 | model of execution is based on the blocking open-read-write-close |
||
20 | paradigm. Since the TCP/IP stack is event based by nature, the TCP/IP |
||
21 | code and the application program must reside in different execution |
||
22 | contexts (threads). |
||
23 | |||
24 | The socket API is a compatibility API for existing applications; |
||
25 | currently it is built on top of the sequential API. It is meant to |
||
26 | provide all functions needed to run socket API applications written |
||
27 | for other platforms (e.g. Unix, Windows). However, due to limitations |
||
28 | in the specification of this API, there might be incompatibilities |
||
29 | that require small modifications of existing programs. |
||
30 | |||
31 | ** Multithreading |
||
32 | |||
33 | lwIP started targeting single-threaded environments. When adding multi- |
||
34 | threading support, instead of making the core thread-safe, another |
||
35 | approach was chosen: there is one main thread running the lwIP core |
||
36 | (also known as the "tcpip_thread"). When running in a multithreaded |
||
37 | environment, raw API functions MUST only be called from the core thread |
||
38 | since raw API functions are not protected from concurrent access (aside |
||
39 | from pbuf- and memory management functions). Application threads using |
||
40 | the sequential- or socket API communicate with this main thread through |
||
41 | message passing. |
||
42 | |||
43 | As such, the list of functions that may be called from |
||
44 | other threads or an ISR is very limited! Only functions |
||
45 | from these API header files are thread-safe: |
||
46 | - api.h |
||
47 | - netbuf.h |
||
48 | - netdb.h |
||
49 | - netifapi.h |
||
50 | - pppapi.h |
||
51 | - sockets.h |
||
52 | - sys.h |
||
53 | |||
54 | Additionally, memory (de-)allocation functions may be |
||
55 | called from multiple threads (not ISR!) with NO_SYS=0 |
||
56 | since they are protected by SYS_LIGHTWEIGHT_PROT and/or |
||
57 | semaphores. |
||
58 | |||
59 | Netconn or Socket API functions are thread safe against the |
||
60 | core thread but they are not reentrant at the control block |
||
61 | granularity level. That is, a UDP or TCP control block must |
||
62 | not be shared among multiple threads without proper locking. |
||
63 | |||
64 | If SYS_LIGHTWEIGHT_PROT is set to 1 and |
||
65 | LWIP_ALLOW_MEM_FREE_FROM_OTHER_CONTEXT is set to 1, |
||
66 | pbuf_free() may also be called from another thread or |
||
67 | an ISR (since only then, mem_free - for PBUF_RAM - may |
||
68 | be called from an ISR: otherwise, the HEAP is only |
||
69 | protected by semaphores). |
||
70 | |||
71 | |||
72 | ** The remainder of this document discusses the "raw" API. ** |
||
73 | |||
74 | The raw TCP/IP interface allows the application program to integrate |
||
75 | better with the TCP/IP code. Program execution is event based by |
||
76 | having callback functions being called from within the TCP/IP |
||
77 | code. The TCP/IP code and the application program both run in the same |
||
78 | thread. The sequential API has a much higher overhead and is not very |
||
79 | well suited for small systems since it forces a multithreaded paradigm |
||
80 | on the application. |
||
81 | |||
82 | The raw TCP/IP interface is not only faster in terms of code execution |
||
83 | time but is also less memory intensive. The drawback is that program |
||
84 | development is somewhat harder and application programs written for |
||
85 | the raw TCP/IP interface are more difficult to understand. Still, this |
||
86 | is the preferred way of writing applications that should be small in |
||
87 | code size and memory usage. |
||
88 | |||
89 | All APIs can be used simultaneously by different application |
||
90 | programs. In fact, the sequential API is implemented as an application |
||
91 | program using the raw TCP/IP interface. |
||
92 | |||
93 | Do not confuse the lwIP raw API with raw Ethernet or IP sockets. |
||
94 | The former is a way of interfacing with the lwIP network stack (including |
||
95 | TCP and UDP); the latter refers to processing raw Ethernet or IP data |
||
96 | instead of TCP connections or UDP packets. |
||
97 | |||
98 | Raw API applications must never block since all packet processing |
||
99 | (input and output) as well as timer processing (TCP mainly) is done |
||
100 | in a single execution context. |
||
101 | |||
102 | --- Callbacks |
||
103 | |||
104 | Program execution is driven by callback functions, which are |
||
105 | invoked by the lwIP core when activity related to that application |
||
106 | occurs. A particular application may register to be notified via a |
||
107 | callback function for events such as incoming data available, outgoing |
||
108 | data sent, error notifications, poll timer expiration, connection |
||
109 | closed, etc. An application can provide a callback function to perform |
||
110 | processing for any or all of these events. Each callback is an ordinary |
||
111 | C function that is called from within the TCP/IP code. Every callback |
||
112 | function is passed the current TCP or UDP connection state as an |
||
113 | argument. Also, in order to be able to keep program specific state, |
||
114 | the callback functions are called with a program specified argument |
||
115 | that is independent of the TCP/IP state. |
||
116 | |||
117 | The function for setting the application connection state is: |
||
118 | |||
119 | - void tcp_arg(struct tcp_pcb *pcb, void *arg) |
||
120 | |||
121 | Specifies the program specific state that should be passed to all |
||
122 | other callback functions. The "pcb" argument is the current TCP |
||
123 | connection control block, and the "arg" argument is the argument |
||
124 | that will be passed to the callbacks. |
||
125 | |||
126 | |||
127 | --- TCP connection setup |
||
128 | |||
129 | The functions used for setting up connections are similar to those of |
||
130 | the sequential API and of the BSD socket API. A new TCP connection |
||
131 | identifier (i.e., a protocol control block - PCB) is created with the |
||
132 | tcp_new() function. This PCB can then be either set to listen for new |
||
133 | incoming connections or be explicitly connected to another host. |
||
134 | |||
135 | - struct tcp_pcb *tcp_new(void) |
||
136 | |||
137 | Creates a new connection identifier (PCB). If memory is not |
||
138 | available for creating the new pcb, NULL is returned. |
||
139 | |||
140 | - err_t tcp_bind(struct tcp_pcb *pcb, ip_addr_t *ipaddr, |
||
141 | u16_t port) |
||
142 | |||
143 | Binds the pcb to a local IP address and port number. The IP address |
||
144 | can be specified as IP_ADDR_ANY in order to bind the connection to |
||
145 | all local IP addresses. |
||
146 | |||
147 | If another connection is bound to the same port, the function will |
||
148 | return ERR_USE, otherwise ERR_OK is returned. |
||
149 | |||
150 | - struct tcp_pcb *tcp_listen(struct tcp_pcb *pcb) |
||
151 | |||
152 | Commands a pcb to start listening for incoming connections. When an |
||
153 | incoming connection is accepted, the function specified with the |
||
154 | tcp_accept() function will be called. The pcb will have to be bound |
||
155 | to a local port with the tcp_bind() function. |
||
156 | |||
157 | The tcp_listen() function returns a new connection identifier, and |
||
158 | the one passed as an argument to the function will be |
||
159 | deallocated. The reason for this behavior is that less memory is |
||
160 | needed for a connection that is listening, so tcp_listen() will |
||
161 | reclaim the memory needed for the original connection and allocate a |
||
162 | new smaller memory block for the listening connection. |
||
163 | |||
164 | tcp_listen() may return NULL if no memory was available for the |
||
165 | listening connection. If so, the memory associated with the pcb |
||
166 | passed as an argument to tcp_listen() will not be deallocated. |
||
167 | |||
168 | - struct tcp_pcb *tcp_listen_with_backlog(struct tcp_pcb *pcb, u8_t backlog) |
||
169 | |||
170 | Same as tcp_listen, but limits the number of outstanding connections |
||
171 | in the listen queue to the value specified by the backlog argument. |
||
172 | To use it, you need to set TCP_LISTEN_BACKLOG=1 in your lwipopts.h. |
||
173 | |||
174 | - void tcp_accept(struct tcp_pcb *pcb, |
||
175 | err_t (* accept)(void *arg, struct tcp_pcb *newpcb, |
||
176 | err_t err)) |
||
177 | |||
178 | Specifies the callback function that should be called when a new |
||
179 | connection arrives on a listening connection. |
||
180 | |||
181 | - err_t tcp_connect(struct tcp_pcb *pcb, ip_addr_t *ipaddr, |
||
182 | u16_t port, err_t (* connected)(void *arg, |
||
183 | struct tcp_pcb *tpcb, |
||
184 | err_t err)); |
||
185 | |||
186 | Sets up the pcb to connect to the remote host and sends the |
||
187 | initial SYN segment which opens the connection. |
||
188 | |||
189 | The tcp_connect() function returns immediately; it does not wait for |
||
190 | the connection to be properly setup. Instead, it will call the |
||
191 | function specified as the fourth argument (the "connected" argument) |
||
192 | when the connection is established. If the connection could not be |
||
193 | properly established, either because the other host refused the |
||
194 | connection or because the other host didn't answer, the "err" |
||
195 | callback function of this pcb (registered with tcp_err, see below) |
||
196 | will be called. |
||
197 | |||
198 | The tcp_connect() function can return ERR_MEM if no memory is |
||
199 | available for enqueueing the SYN segment. If the SYN indeed was |
||
200 | enqueued successfully, the tcp_connect() function returns ERR_OK. |
||
201 | |||
202 | |||
203 | --- Sending TCP data |
||
204 | |||
205 | TCP data is sent by enqueueing the data with a call to |
||
206 | tcp_write(). When the data is successfully transmitted to the remote |
||
207 | host, the application will be notified with a call to a specified |
||
208 | callback function. |
||
209 | |||
210 | - err_t tcp_write(struct tcp_pcb *pcb, const void *dataptr, u16_t len, |
||
211 | u8_t apiflags) |
||
212 | |||
213 | Enqueues the data pointed to by the argument dataptr. The length of |
||
214 | the data is passed as the len parameter. The apiflags can be one or more of: |
||
215 | - TCP_WRITE_FLAG_COPY: indicates that new memory should be allocated |
||
216 | for the data to be copied into. If this flag is not given, no new memory |
||
217 | should be allocated and the data should only be referenced by pointer. This |
||
218 | also means that the memory behind dataptr must not change until the data is |
||
219 | ACKed by the remote host |
||
220 | - TCP_WRITE_FLAG_MORE: indicates that more data follows. If this is omitted, |
||
221 | the PSH flag is set in the last segment created by this call to tcp_write. |
||
222 | If this flag is given, the PSH flag is not set. |
||
223 | |||
224 | The tcp_write() function will fail and return ERR_MEM if the length |
||
225 | of the data exceeds the current send buffer size or if the length of |
||
226 | the queue of outgoing segments is larger than the upper limit defined |
||
227 | in lwipopts.h. The number of bytes available in the output queue can |
||
228 | be retrieved with the tcp_sndbuf() function. |
||
229 | |||
230 | The proper way to use this function is to call the function with at |
||
231 | most tcp_sndbuf() bytes of data. If the function returns ERR_MEM, |
||
232 | the application should wait until some of the currently enqueued |
||
233 | data has been successfully received by the other host and try again. |
||
234 | |||
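The calling pattern described above can be sketched as a small helper that decides how many bytes to hand to tcp_write() in one call. This is only an illustration: next_chunk_len is a hypothetical name, not part of lwIP, and the sketch assumes the caller passes in the value it obtained from tcp_sndbuf() as a 16-bit quantity, matching the u16_t len parameter of tcp_write().

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of lwIP): how many bytes to pass to
 * tcp_write() next, given the bytes still to send and the free space
 * reported by tcp_sndbuf(). A result of 0 means there is no room right
 * now, so wait for the sent callback (or the poll callback) and retry. */
static uint16_t next_chunk_len(size_t remaining, uint16_t sndbuf)
{
    return (remaining < (size_t)sndbuf) ? (uint16_t)remaining : sndbuf;
}
```

An application would call this before each tcp_write(), advance its data pointer by the returned length when tcp_write() returns ERR_OK, and leave the pointer unchanged when it returns ERR_MEM.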
235 | - void tcp_sent(struct tcp_pcb *pcb, |
||
236 | err_t (* sent)(void *arg, struct tcp_pcb *tpcb, |
||
237 | u16_t len)) |
||
238 | |||
239 | Specifies the callback function that should be called when data has |
||
240 | successfully been received (i.e., acknowledged) by the remote |
||
241 | host. The len argument passed to the callback function gives the |
||
242 | number of bytes that were acknowledged by the last acknowledgment. |
||
243 | |||
244 | |||
245 | --- Receiving TCP data |
||
246 | |||
247 | TCP data reception is callback based - an application specified |
||
248 | callback function is called when new data arrives. When the |
||
249 | application has taken the data, it has to call the tcp_recved() |
||
250 | function to indicate that TCP can increase the advertised receive |
||
251 | window. |
||
252 | |||
253 | - void tcp_recv(struct tcp_pcb *pcb, |
||
254 | err_t (* recv)(void *arg, struct tcp_pcb *tpcb, |
||
255 | struct pbuf *p, err_t err)) |
||
256 | |||
257 | Sets the callback function that will be called when new data |
||
258 | arrives. The callback function will be passed a NULL pbuf to |
||
259 | indicate that the remote host has closed the connection. If |
||
260 | there are no errors and the callback function is to return |
||
261 | ERR_OK, then it must free the pbuf. Otherwise, it must not |
||
262 | free the pbuf so that lwIP core code can store it. |
||
263 | |||
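The pbuf ownership rule above can be illustrated with a receive callback that simply discards incoming data. This is a sketch under stated assumptions, not lwIP code: the types and the three functions in the first section are minimal stand-ins that only mimic the signatures given in this document so the example compiles on its own, and discard_recv is a hypothetical name. In a real build, remove the stand-ins and include "lwip/tcp.h".

```c
#include <stddef.h>
#include <stdint.h>

/* --- Stand-ins so this sketch compiles without lwIP (assumptions). --- */
typedef int8_t err_t;
#define ERR_OK 0
struct pbuf { uint16_t tot_len; };
struct tcp_pcb { uint32_t recved; int closed; };
static int pbufs_freed;
static void tcp_recved(struct tcp_pcb *pcb, uint16_t len) { pcb->recved += len; }
static uint8_t pbuf_free(struct pbuf *p) { (void)p; pbufs_freed++; return 1; }
static err_t tcp_close(struct tcp_pcb *pcb) { pcb->closed = 1; return ERR_OK; }
/* --- End of stand-ins. --- */

/* Hypothetical receive callback that consumes and discards all data. */
static err_t discard_recv(void *arg, struct tcp_pcb *tpcb,
                          struct pbuf *p, err_t err)
{
    (void)arg;
    if (p == NULL) {
        /* NULL pbuf: the remote host has closed the connection. */
        return tcp_close(tpcb);
    }
    if (err != ERR_OK) {
        /* Data not taken: do not free the pbuf, just report the error. */
        return err;
    }
    tcp_recved(tpcb, p->tot_len); /* data taken, the window may reopen */
    pbuf_free(p);                 /* returning ERR_OK, so the callback frees p */
    return ERR_OK;
}
```

The callback would be registered with tcp_recv(pcb, discard_recv). A real application should also handle tcp_close() returning ERR_MEM, for example by retrying from the poll callback.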
264 | - void tcp_recved(struct tcp_pcb *pcb, u16_t len) |
||
265 | |||
266 | Must be called when the application has received the data. The len |
||
267 | argument indicates the length of the received data. |
||
268 | |||
269 | |||
270 | --- Application polling |
||
271 | |||
272 | When a connection is idle (i.e., no data is either transmitted or |
||
273 | received), lwIP will repeatedly poll the application by calling a |
||
274 | specified callback function. This can be used either as a watchdog |
||
275 | timer for killing connections that have stayed idle for too long, or |
||
276 | as a method of waiting for memory to become available. For instance, |
||
277 | if a call to tcp_write() has failed because memory wasn't available, |
||
278 | the application may use the polling functionality to call tcp_write() |
||
279 | again when the connection has been idle for a while. |
||
280 | |||
281 | - void tcp_poll(struct tcp_pcb *pcb, |
||
282 | err_t (* poll)(void *arg, struct tcp_pcb *tpcb), |
||
283 | u8_t interval) |
||
284 | |||
285 | Specifies the polling interval and the callback function that should |
||
286 | be called to poll the application. The interval is specified in |
||
287 | number of TCP coarse grained timer shots, which typically occur |
||
288 | twice a second. An interval of 10 means that the application would |
||
289 | be polled every 5 seconds. |
||
290 | |||
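The interval arithmetic above can be written out as follows. The sketch assumes the coarse grained timer fires every 500 ms ("twice a second"); COARSE_TIMER_MS and poll_period_ms are names chosen for this example, and the actual period depends on your lwIP configuration.

```c
#include <stdint.h>

/* Assumed coarse grained timer period in milliseconds (twice a second). */
#define COARSE_TIMER_MS 500u

/* Time between two poll callbacks for a given tcp_poll() interval. */
static uint32_t poll_period_ms(uint8_t interval)
{
    return (uint32_t)interval * COARSE_TIMER_MS;
}
```

With these assumptions, an interval of 10 gives 5000 ms, matching the 5 seconds stated above.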
291 | |||
292 | --- Closing and aborting connections |
||
293 | |||
294 | - err_t tcp_close(struct tcp_pcb *pcb) |
||
295 | |||
296 | Closes the connection. The function may return ERR_MEM if no memory |
||
297 | was available for closing the connection. If so, the application |
||
298 | should wait and try again either by using the acknowledgment |
||
299 | callback or the polling functionality. If the close succeeds, the |
||
300 | function returns ERR_OK. |
||
301 | |||
302 | The pcb is deallocated by the TCP code after a call to tcp_close(). |
||
303 | |||
304 | - void tcp_abort(struct tcp_pcb *pcb) |
||
305 | |||
306 | Aborts the connection by sending a RST (reset) segment to the remote |
||
307 | host. The pcb is deallocated. This function never fails. |
||
308 | |||
309 | ATTENTION: When calling this from one of the TCP callbacks, make |
||
310 | sure you always return ERR_ABRT (and never return ERR_ABRT otherwise), |
||
311 | or you will risk accessing deallocated memory or memory leaks! |
||
312 | |||
313 | |||
314 | If a connection is aborted because of an error, the application is |
||
315 | alerted of this event by the err callback. One error that might abort a |
||
316 | connection is a shortage of memory. The callback |
||
317 | function to be called is set using the tcp_err() function. |
||
318 | |||
319 | - void tcp_err(struct tcp_pcb *pcb, void (* err)(void *arg, |
||
320 | err_t err)) |
||
321 | |||
322 | The error callback function does not get the pcb passed to it as a |
||
323 | parameter since the pcb may already have been deallocated. |
||
324 | |||
325 | |||
326 | --- UDP interface |
||
327 | |||
328 | The UDP interface is similar to that of TCP, but due to the lower |
||
329 | level of complexity of UDP, the interface is significantly simpler. |
||
330 | |||
331 | - struct udp_pcb *udp_new(void) |
||
332 | |||
333 | Creates a new UDP pcb which can be used for UDP communication. The |
||
334 | pcb is not active until it has either been bound to a local address |
||
335 | or connected to a remote address. |
||
336 | |||
337 | - void udp_remove(struct udp_pcb *pcb) |
||
338 | |||
339 | Removes and deallocates the pcb. |
||
340 | |||
341 | - err_t udp_bind(struct udp_pcb *pcb, ip_addr_t *ipaddr, |
||
342 | u16_t port) |
||
343 | |||
344 | Binds the pcb to a local address. The IP-address argument "ipaddr" |
||
345 | can be IP_ADDR_ANY to indicate that it should listen to any local IP |
||
346 | address. The function currently always returns ERR_OK. |
||
347 | |||
348 | - err_t udp_connect(struct udp_pcb *pcb, ip_addr_t *ipaddr, |
||
349 | u16_t port) |
||
350 | |||
351 | Sets the remote end of the pcb. This function does not generate any |
||
352 | network traffic, but only sets the remote address of the pcb. |
||
353 | |||
354 | - err_t udp_disconnect(struct udp_pcb *pcb) |
||
355 | |||
356 | Removes the remote end of the pcb. This function does not generate |
||
357 | any network traffic, but only removes the remote address of the pcb. |
||
358 | |||
359 | - err_t udp_send(struct udp_pcb *pcb, struct pbuf *p) |
||
360 | |||
361 | Sends the pbuf p. The pbuf is not deallocated. |
||
362 | |||
363 | - void udp_recv(struct udp_pcb *pcb, |
||
364 | void (* recv)(void *arg, struct udp_pcb *upcb, |
||
365 | struct pbuf *p, |
||
366 | ip_addr_t *addr, |
||
367 | u16_t port), |
||
368 | void *recv_arg) |
||
369 | |||
370 | Specifies a callback function that should be called when a UDP |
||
371 | datagram is received. |
||
372 | |||
373 | |||
374 | --- System initialization |
||
375 | |||
376 | A truly complete and generic sequence for initializing the lwIP stack |
||
377 | cannot be given because it depends on additional initializations for |
||
378 | your runtime environment (e.g. timers). |
||
379 | |||
380 | We can give you some idea on how to proceed when using the raw API. |
||
381 | We assume a configuration using a single Ethernet netif and the |
||
382 | UDP and TCP transport layers, IPv4 and the DHCP client. |
||
383 | |||
384 | Call these functions in the order of appearance: |
||
385 | |||
386 | - lwip_init() |
||
387 | |||
388 | Initialize the lwIP stack and all of its subsystems. |
||
389 | |||
390 | - netif_add(struct netif *netif, const ip4_addr_t *ipaddr, |
||
391 | const ip4_addr_t *netmask, const ip4_addr_t *gw, |
||
392 | void *state, netif_init_fn init, netif_input_fn input) |
||
393 | |||
394 | Adds your network interface to the netif_list. Allocate a struct |
||
395 | netif and pass a pointer to this structure as the first argument. |
||
396 | Give pointers to cleared ip_addr structures when using DHCP, |
||
397 | or fill them with sane numbers otherwise. The state pointer may be NULL. |
||
398 | |||
399 | The init function pointer must point to an initialization function for |
||
400 | your Ethernet netif interface. The following code illustrates its use. |
||
401 | |||
402 | err_t netif_if_init(struct netif *netif) |
||
403 | { |
||
404 | u8_t i; |
||
405 | |||
406 | for (i = 0; i < ETHARP_HWADDR_LEN; i++) { |
||
407 | netif->hwaddr[i] = some_eth_addr[i]; |
||
408 | } |
||
409 | init_my_eth_device(); |
||
410 | return ERR_OK; |
||
411 | } |
||
412 | |||
413 | For Ethernet drivers, the input function pointer must point to the lwIP |
||
414 | function ethernet_input() declared in "netif/etharp.h". Other drivers |
||
415 | must use ip_input() declared in "lwip/ip.h". |
||
416 | |||
417 | - netif_set_default(struct netif *netif) |
||
418 | |||
419 | Registers the default network interface. |
||
420 | |||
421 | - netif_set_link_up(struct netif *netif) |
||
422 | |||
423 | This is the hardware link state; e.g. whether the cable is plugged in for a |
||
424 | wired Ethernet interface. This function must be called even if you don't know |
||
425 | the current state. Having link up and link down events is optional but |
||
426 | DHCP and IPv6 discovery benefit from those events. |
||
427 | |||
428 | - netif_set_up(struct netif *netif) |
||
429 | |||
430 | This is the administrative (= software) state of the netif; when the |
||
431 | netif is fully configured this function must be called. |
||
432 | |||
433 | - dhcp_start(struct netif *netif) |
||
434 | |||
435 | Creates a new DHCP client for this interface on the first call. |
||
436 | |||
437 | You can peek in the netif->dhcp struct for the actual DHCP status. |
||
438 | |||
439 | - sys_check_timeouts() |
||
440 | |||
441 | When the system is running, you have to periodically call |
||
442 | sys_check_timeouts() which will handle all timers for all protocols in |
||
443 | the stack; add this to your main loop or equivalent. |
||
444 | |||
445 | |||
446 | --- Optimization hints |
||
447 | |||
448 | The first thing you want to optimize is the lwip_standard_checksum() |
||
449 | routine from src/core/inet.c. You can override this standard |
||
450 | function with the #define LWIP_CHKSUM <your_checksum_routine>. |
||
451 | |||
452 | There are C examples given in inet.c or you might want to |
||
453 | craft an assembly function for this. RFC1071 is a good |
||
454 | introduction to this subject. |
||
455 | |||
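As a starting point for a custom routine, the RFC 1071 algorithm can be written portably as below. This is a generic sketch, not the lwIP implementation: inet_checksum is a hypothetical name, and its return convention (the complemented sum, with the first byte of the data treated as the high byte of each 16-bit word) is an assumption of this example. Before wiring a routine into LWIP_CHKSUM, check the exact contract lwIP expects (argument types, byte order, whether the result is complemented) against the lwIP sources.

```c
#include <stddef.h>
#include <stdint.h>

/* RFC 1071 ones-complement checksum over len bytes.
 * Suitable for buffers up to 64 KiB; longer inputs would need the
 * carries folded inside the loop to avoid overflowing the accumulator. */
static uint16_t inet_checksum(const void *data, size_t len)
{
    const uint8_t *p = (const uint8_t *)data;
    uint32_t sum = 0;

    while (len > 1) {
        sum += ((uint32_t)p[0] << 8) | p[1]; /* add one 16-bit word */
        p += 2;
        len -= 2;
    }
    if (len == 1) {
        sum += (uint32_t)p[0] << 8; /* odd trailing byte, zero padded */
    }
    while (sum >> 16) {
        sum = (sum & 0xFFFFu) + (sum >> 16); /* fold carries back in */
    }
    return (uint16_t)~sum;
}
```

For the example bytes used in RFC 1071 (00 01 f2 03 f4 f5 f6 f7), the folded sum is 0xddf2 and this function returns its complement, 0x220d.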
456 | Other significant improvements can be made by supplying |
||
457 | assembly or inline replacements for htons() and htonl() |
||
458 | if you're using a little-endian architecture. |
||
459 | #define lwip_htons(x) <your_htons> |
||
460 | #define lwip_htonl(x) <your_htonl> |
||
461 | If you #define them to htons() and htonl(), you should |
||
462 | #define LWIP_DONT_PROVIDE_BYTEORDER_FUNCTIONS to prevent lwIP from |
||
463 | defining hton*/ntoh* compatibility macros. |
||
464 | |||
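A portable inline version of such replacements might look like the following. The names my_htons and my_htonl are hypothetical, and the functions swap bytes unconditionally, so they are only correct on a little-endian host. Many compilers recognize this pattern and emit a single byte-swap instruction, but that is compiler dependent.

```c
#include <stdint.h>

/* Hypothetical replacements for little-endian targets only:
 * they always swap, which would be wrong on a big-endian host. */
static inline uint16_t my_htons(uint16_t x)
{
    return (uint16_t)((x << 8) | (x >> 8));
}

static inline uint32_t my_htonl(uint32_t x)
{
    return ((x & 0x000000FFu) << 24) |
           ((x & 0x0000FF00u) << 8)  |
           ((x & 0x00FF0000u) >> 8)  |
           ((x & 0xFF000000u) >> 24);
}

/* In your port's configuration you would then write, for example:
 *   #define lwip_htons(x) my_htons(x)
 *   #define lwip_htonl(x) my_htonl(x)
 */
```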
465 | Check whether your network interface driver reads at |
||
466 | a higher speed than the maximum wire-speed. If the |
||
467 | hardware isn't serviced frequently and fast enough, |
||
468 | buffer overflows are likely to occur. |
||
469 | |||
470 | E.g. when using the cs8900 driver, call cs8900if_service(ethif) |
||
471 | as frequently as possible. When using an RTOS let the cs8900 interrupt |
||
472 | wake a high priority task that services your driver using a binary |
||
473 | semaphore or event flag. Some drivers might allow additional tuning |
||
474 | to match your application and network. |
||
475 | |||
476 | For a production release it is recommended to set LWIP_STATS to 0. |
||
477 | Note that speed performance isn't influenced much by simply setting |
||
478 | high values to the memory options. |
||
479 | |||
480 | For more optimization hints take a look at the lwIP wiki. |
||
481 | |||
482 | --- Zero-copy MACs |
||
483 | |||
484 | To achieve zero-copy on transmit, the data passed to the raw API must |
||
485 | remain unchanged until sent. Because the send- (or write-)functions return |
||
486 | when the packets have been enqueued for sending, data must be kept stable |
||
487 | after that, too. |
||
488 | |||
489 | This implies that *ALL* pbufs passed to send functions must *not* be reused by |
||
490 | the application unless the send function returns an error indicating the pbuf |
||
491 | is not sent/queued for sending. |
||
492 | |||
493 | |||
494 | Also, data passed to tcp_write without the copy-flag must not be changed until |
||
495 | sent and ACKed (check the amount of bytes marked as 'sent')! |
||
496 | |||
497 | Therefore, be careful which type of PBUF you use and if you copy TCP data |
||
498 | or not! |
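One way to follow the rule for tcp_write() without the copy flag is to count bytes queued against bytes acknowledged. The sketch below is only an illustration with hypothetical names (zc_tracker and its helpers are not part of lwIP). It assumes that every successful tcp_write() on the connection is recorded with zc_on_write_ok() and that zc_on_sent() is called from the sent callback with its len argument.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-connection bookkeeping for zero-copy writes. */
struct zc_tracker {
    size_t queued; /* bytes accepted by tcp_write() so far */
    size_t acked;  /* bytes reported through the sent callback so far */
};

/* Call after tcp_write() returned ERR_OK for len bytes. */
static void zc_on_write_ok(struct zc_tracker *t, uint16_t len)
{
    t->queued += len;
}

/* Call from the sent callback with its len argument. */
static void zc_on_sent(struct zc_tracker *t, uint16_t len)
{
    t->acked += len;
}

/* Nonzero once everything queued so far has been ACKed, i.e. the
 * application may modify or reuse the referenced buffers. */
static int zc_buffers_reusable(const struct zc_tracker *t)
{
    return t->acked >= t->queued;
}
```

Because the sent callback reports acknowledged bytes for the connection as a whole, this simple counter only tells you when all outstanding data is ACKed; tracking individual buffers would need a queue of buffer lengths instead.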