Raw TCP/IP interface for lwIP

Authors: Adam Dunkels, Leon Woestenberg, Christiaan Simons

lwIP provides three application programming interfaces (APIs) for programs
to use for communication with the TCP/IP code:
* low-level "core" / "callback" or "raw" API.
* higher-level "sequential" API.
* BSD-style socket API.

The raw API (sometimes called the native API) is an event-driven API that
implements zero-copy send and receive and is designed to be used without
an operating system. This API is also used by the core stack for
interaction between the various protocols. It is the only API available
when running lwIP without an operating system.

The sequential API provides a way for ordinary, sequential programs
to use the lwIP stack. It is quite similar to the BSD socket API. The
model of execution is based on the blocking open-read-write-close
paradigm. Since the TCP/IP stack is event based by nature, the TCP/IP
code and the application program must reside in different execution
contexts (threads).

The socket API is a compatibility API for existing applications;
it is currently built on top of the sequential API. It is meant to
provide all functions needed to run socket API applications written
for other platforms (e.g. Unix, Windows). However, due to limitations
in the specification of this API, there might be incompatibilities
that require small modifications of existing programs.

** Multithreading

lwIP started out targeting single-threaded environments. When adding
multithreading support, instead of making the core thread-safe, another
approach was chosen: there is one main thread running the lwIP core
(also known as the "tcpip_thread"). When running in a multithreaded
environment, raw API functions MUST only be called from the core thread
since raw API functions are not protected from concurrent access (aside
from pbuf- and memory management functions). Application threads using
the sequential- or socket API communicate with this main thread through
message passing.

As such, the list of functions that may be called from
other threads or an ISR is very limited! Only functions
from these API header files are thread-safe:

- api.h

- netbuf.h

- netdb.h

- netifapi.h

- pppapi.h

- sockets.h

- sys.h

Additionally, memory (de-)allocation functions may be
called from multiple threads (not ISR!) with NO_SYS=0
since they are protected by SYS_LIGHTWEIGHT_PROT and/or
semaphores.

Netconn or Socket API functions are thread safe against the
core thread but they are not reentrant at the control block
granularity level. That is, a UDP or TCP control block must
not be shared among multiple threads without proper locking.

If SYS_LIGHTWEIGHT_PROT is set to 1 and
LWIP_ALLOW_MEM_FREE_FROM_OTHER_CONTEXT is set to 1,
pbuf_free() may also be called from another thread or
an ISR (since only then, mem_free - for PBUF_RAM - may
be called from an ISR: otherwise, the HEAP is only
protected by semaphores).

** The remainder of this document discusses the "raw" API. **

The raw TCP/IP interface allows the application program to integrate
better with the TCP/IP code. Program execution is event based: callback
functions are called from within the TCP/IP code. The TCP/IP code and
the application program both run in the same thread. The sequential API
has a much higher overhead and is not very well suited for small systems
since it forces a multithreaded paradigm on the application.

The raw TCP/IP interface is not only faster in terms of code execution
time but is also less memory intensive. The drawback is that program
development is somewhat harder and application programs written for
the raw TCP/IP interface are more difficult to understand. Still, this
is the preferred way of writing applications that should be small in
code size and memory usage.

All APIs can be used simultaneously by different application
programs. In fact, the sequential API is implemented as an application
program using the raw TCP/IP interface.

Do not confuse the lwIP raw API with raw Ethernet or IP sockets.
The former is a way of interfacing with the lwIP network stack (including
TCP and UDP), the latter refers to processing raw Ethernet or IP data
instead of TCP connections or UDP packets.

Raw API applications must never block since all packet processing
(input and output) as well as timer processing (TCP mainly) is done
in a single execution context.

--- Callbacks

Program execution is driven by callback functions, which are
invoked by the lwIP core when activity related to that application
occurs. A particular application may register to be notified via a
callback function for events such as incoming data available, outgoing
data sent, error notifications, poll timer expiration, connection
closed, etc. An application can provide a callback function to perform
processing for any or all of these events. Each callback is an ordinary
C function that is called from within the TCP/IP code. Every callback
function is passed the current TCP or UDP connection state as an
argument. Also, in order to be able to keep program specific state,
the callback functions are called with a program specified argument
that is independent of the TCP/IP state.

The function for setting the application connection state is:

- void tcp_arg(struct tcp_pcb *pcb, void *arg)

  Specifies the program specific state that should be passed to all
  other callback functions. The "pcb" argument is the current TCP
  connection control block, and the "arg" argument is the argument
  that will be passed to the callbacks.

--- TCP connection setup

The functions used for setting up connections are similar to those of
the sequential API and of the BSD socket API. A new TCP connection
identifier (i.e., a protocol control block - PCB) is created with the
tcp_new() function. This PCB can then be either set to listen for new
incoming connections or be explicitly connected to another host.

- struct tcp_pcb *tcp_new(void)

  Creates a new connection identifier (PCB). If memory is not
  available for creating the new pcb, NULL is returned.

- err_t tcp_bind(struct tcp_pcb *pcb, ip_addr_t *ipaddr,
                 u16_t port)

  Binds the pcb to a local IP address and port number. The IP address
  can be specified as IP_ADDR_ANY in order to bind the connection to
  all local IP addresses.

  If another connection is bound to the same port, the function will
  return ERR_USE, otherwise ERR_OK is returned.

- struct tcp_pcb *tcp_listen(struct tcp_pcb *pcb)

  Commands a pcb to start listening for incoming connections. When an
  incoming connection is accepted, the function specified with the
  tcp_accept() function will be called. The pcb will have to be bound
  to a local port with the tcp_bind() function.

  The tcp_listen() function returns a new connection identifier, and
  the one passed as an argument to the function will be
  deallocated. The reason for this behavior is that less memory is
  needed for a connection that is listening, so tcp_listen() will
  reclaim the memory needed for the original connection and allocate a
  new smaller memory block for the listening connection.

  tcp_listen() may return NULL if no memory was available for the
  listening connection. If so, the memory associated with the pcb
  passed as an argument to tcp_listen() will not be deallocated.

- struct tcp_pcb *tcp_listen_with_backlog(struct tcp_pcb *pcb, u8_t backlog)

  Same as tcp_listen, but limits the number of outstanding connections
  in the listen queue to the value specified by the backlog argument.
  To use it, you need to set TCP_LISTEN_BACKLOG=1 in your lwipopts.h.

- void tcp_accept(struct tcp_pcb *pcb,
                  err_t (* accept)(void *arg, struct tcp_pcb *newpcb,
                                   err_t err))

  Specifies the callback function that should be called when a new
  connection arrives on a listening connection.

- err_t tcp_connect(struct tcp_pcb *pcb, ip_addr_t *ipaddr,
                    u16_t port, err_t (* connected)(void *arg,
                                                    struct tcp_pcb *tpcb,
                                                    err_t err))

  Sets up the pcb to connect to the remote host and sends the
  initial SYN segment which opens the connection.

  The tcp_connect() function returns immediately; it does not wait for
  the connection to be properly set up. Instead, it will call the
  function specified as the fourth argument (the "connected" argument)
  when the connection is established. If the connection could not be
  properly established, either because the other host refused the
  connection or because the other host didn't answer, the "err"
  callback function of this pcb (registered with tcp_err, see below)
  will be called.

  The tcp_connect() function can return ERR_MEM if no memory is
  available for enqueueing the SYN segment. If the SYN indeed was
  enqueued successfully, the tcp_connect() function returns ERR_OK.

--- Sending TCP data

TCP data is sent by enqueueing the data with a call to
tcp_write(). When the data is successfully transmitted to the remote
host, the application will be notified with a call to a specified
callback function.

- err_t tcp_write(struct tcp_pcb *pcb, const void *dataptr, u16_t len,
                  u8_t apiflags)

  Enqueues the data pointed to by the argument dataptr. The length of
  the data is passed as the len parameter. The apiflags can be one or more of:
  - TCP_WRITE_FLAG_COPY: indicates that new memory should be allocated
    for the data to be copied into. If this flag is not given, no new memory
    should be allocated and the data should only be referenced by pointer. This
    also means that the memory behind dataptr must not change until the data is
    ACKed by the remote host.
  - TCP_WRITE_FLAG_MORE: indicates that more data follows. If this is omitted,
    the PSH flag is set in the last segment created by this call to tcp_write.
    If this flag is given, the PSH flag is not set.

  The tcp_write() function will fail and return ERR_MEM if the length
  of the data exceeds the current send buffer size or if the length of
  the queue of outgoing segments is larger than the upper limit defined
  in lwipopts.h. The number of bytes available in the output queue can
  be retrieved with the tcp_sndbuf() function.

  The proper way to use this function is to call the function with at
  most tcp_sndbuf() bytes of data. If the function returns ERR_MEM,
  the application should wait until some of the currently enqueued
  data has been successfully received by the other host and try again.
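The sizing rule above can be sketched in plain C. The helper name
next_write_len() is hypothetical and not part of lwIP; it only computes
how many bytes to hand to the next tcp_write() call, given the bytes still
to send and the value reported by tcp_sndbuf(). It assumes the len
parameter is 16 bits wide, as in the prototype above.

```c
#include <stddef.h>
#include <stdint.h>

/* Largest value that fits the 16-bit "len" parameter of tcp_write(). */
#define MAX_WRITE_LEN 0xFFFFu

/*
 * Hypothetical helper (not an lwIP function).
 * remaining   - bytes the application still wants to send
 * sndbuf_free - bytes currently reported as free by tcp_sndbuf()
 * Returns the number of bytes to pass to the next tcp_write() call.
 */
uint16_t next_write_len(size_t remaining, size_t sndbuf_free)
{
    size_t n = remaining;

    if (n > sndbuf_free) {
        n = sndbuf_free;        /* never offer more than the free space */
    }
    if (n > MAX_WRITE_LEN) {
        n = MAX_WRITE_LEN;      /* len is a 16-bit quantity */
    }
    return (uint16_t)n;
}
```

A return value of 0 means there is currently no room; the application
would then wait for the sent or poll callback before trying again.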

- void tcp_sent(struct tcp_pcb *pcb,
                err_t (* sent)(void *arg, struct tcp_pcb *tpcb,
                               u16_t len))

  Specifies the callback function that should be called when data has
  successfully been received (i.e., acknowledged) by the remote
  host. The len argument passed to the callback function gives the
  number of bytes that were acknowledged by the last acknowledgment.

--- Receiving TCP data

TCP data reception is callback based - an application specified
callback function is called when new data arrives. When the
application has taken the data, it has to call the tcp_recved()
function to indicate that TCP can advertise a larger receive
window.

- void tcp_recv(struct tcp_pcb *pcb,
                err_t (* recv)(void *arg, struct tcp_pcb *tpcb,
                               struct pbuf *p, err_t err))

  Sets the callback function that will be called when new data
  arrives. The callback function will be passed a NULL pbuf to
  indicate that the remote host has closed the connection. If
  there are no errors and the callback function is to return
  ERR_OK, then it must free the pbuf. Otherwise, it must not
  free the pbuf so that lwIP core code can store it.

- void tcp_recved(struct tcp_pcb *pcb, u16_t len)

  Must be called when the application has received the data. The len
  argument indicates the length of the received data.

--- Application polling

When a connection is idle (i.e., no data is either transmitted or
received), lwIP will repeatedly poll the application by calling a
specified callback function. This can be used either as a watchdog
timer for killing connections that have stayed idle for too long, or
as a method of waiting for memory to become available. For instance,
if a call to tcp_write() has failed because memory wasn't available,
the application may use the polling functionality to call tcp_write()
again when the connection has been idle for a while.
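The watchdog use mentioned above can be sketched with a small counter kept
in the application state (the kind of struct passed via tcp_arg()). The
struct and function names below are hypothetical and not part of lwIP; the
sketch only shows the bookkeeping, not the callbacks themselves.

```c
#include <stdbool.h>

/* Hypothetical per-connection application state. */
struct conn_state {
    unsigned idle_polls;   /* consecutive poll callbacks without traffic */
};

/* To be called from the recv and sent callbacks: traffic was seen. */
void conn_activity(struct conn_state *cs)
{
    cs->idle_polls = 0;
}

/*
 * To be called from the poll callback. Returns true once the connection
 * has been polled max_idle_polls times in a row without any activity,
 * which the application may treat as "idle for too long".
 */
bool conn_poll_expired(struct conn_state *cs, unsigned max_idle_polls)
{
    cs->idle_polls++;
    return cs->idle_polls >= max_idle_polls;
}
```

Resetting the counter on activity keeps the count meaningful even if the
poll callback happens to fire between bursts of traffic.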

- void tcp_poll(struct tcp_pcb *pcb,
                err_t (* poll)(void *arg, struct tcp_pcb *tpcb),
                u8_t interval)

  Specifies the polling interval and the callback function that should
  be called to poll the application. The interval is specified in
  number of TCP coarse grained timer shots, which typically occur
  twice a second. An interval of 10 means that the application would
  be polled every 5 seconds.
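The interval arithmetic can be written down as a small helper. This is a
sketch only: the 500 ms timer period is an assumption taken from the
"twice a second" figure above and may differ in your configuration, and
the name poll_interval_from_ms() is hypothetical.

```c
#include <stdint.h>

/* Assumed period of the TCP coarse grained timer, in milliseconds. */
#define COARSE_TIMER_MS 500u

/*
 * Convert a desired polling period in milliseconds into the "interval"
 * argument of tcp_poll(), rounding up and clamping to the 1..255 range
 * of an 8-bit value.
 */
uint8_t poll_interval_from_ms(uint32_t ms)
{
    uint32_t shots;

    if (ms >= 255u * COARSE_TIMER_MS) {
        return 255u;                            /* largest u8_t interval */
    }
    shots = (ms + COARSE_TIMER_MS - 1u) / COARSE_TIMER_MS;   /* round up */
    return (uint8_t)(shots == 0u ? 1u : shots);
}
```

With these assumptions, a request for 5000 ms yields an interval of 10,
matching the example in the text.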


--- Closing and aborting connections

- err_t tcp_close(struct tcp_pcb *pcb)

  Closes the connection. The function may return ERR_MEM if no memory
  was available for closing the connection. If so, the application
  should wait and try again either by using the acknowledgment
  callback or the polling functionality. If the close succeeds, the
  function returns ERR_OK.

  The pcb is deallocated by the TCP code after a call to tcp_close().

- void tcp_abort(struct tcp_pcb *pcb)

  Aborts the connection by sending a RST (reset) segment to the remote
  host. The pcb is deallocated. This function never fails.

  ATTENTION: When calling this from one of the TCP callbacks, make
  sure you always return ERR_ABRT (and never return ERR_ABRT otherwise),
  or you will risk accessing deallocated memory or memory leaks!


If a connection is aborted because of an error, the application is
alerted of this event by the err callback. A shortage of memory is one
error that might abort a connection. The callback
function to be called is set using the tcp_err() function.

- void tcp_err(struct tcp_pcb *pcb, void (* err)(void *arg,
                                                 err_t err))

  The error callback function does not get the pcb passed to it as a
  parameter since the pcb may already have been deallocated.

--- UDP interface

The UDP interface is similar to that of TCP, but due to the lower
level of complexity of UDP, the interface is significantly simpler.

- struct udp_pcb *udp_new(void)

  Creates a new UDP pcb which can be used for UDP communication. The
  pcb is not active until it has either been bound to a local address
  or connected to a remote address.

- void udp_remove(struct udp_pcb *pcb)

  Removes and deallocates the pcb.

- err_t udp_bind(struct udp_pcb *pcb, ip_addr_t *ipaddr,
                 u16_t port)

  Binds the pcb to a local address. The IP-address argument "ipaddr"
  can be IP_ADDR_ANY to indicate that it should listen to any local IP
  address. The function currently always returns ERR_OK.

- err_t udp_connect(struct udp_pcb *pcb, ip_addr_t *ipaddr,
                    u16_t port)

  Sets the remote end of the pcb. This function does not generate any
  network traffic; it only sets the remote address of the pcb.

- err_t udp_disconnect(struct udp_pcb *pcb)

  Removes the remote end of the pcb. This function does not generate
  any network traffic; it only removes the remote address of the pcb.

- err_t udp_send(struct udp_pcb *pcb, struct pbuf *p)

  Sends the pbuf p. The pbuf is not deallocated.

- void udp_recv(struct udp_pcb *pcb,
                void (* recv)(void *arg, struct udp_pcb *upcb,
                              struct pbuf *p,
                              ip_addr_t *addr,
                              u16_t port),
                void *recv_arg)

  Specifies a callback function that should be called when a UDP
  datagram is received.


--- System initialization

A truly complete and generic sequence for initializing the lwIP stack
cannot be given because it depends on additional initializations for
your runtime environment (e.g. timers).

We can give you some idea of how to proceed when using the raw API.
We assume a configuration using a single Ethernet netif and the
UDP and TCP transport layers, IPv4 and the DHCP client.

Call these functions in the order of appearance:

- lwip_init()

  Initialize the lwIP stack and all of its subsystems.

- netif_add(struct netif *netif, const ip4_addr_t *ipaddr,
            const ip4_addr_t *netmask, const ip4_addr_t *gw,
            void *state, netif_init_fn init, netif_input_fn input)

  Adds your network interface to the netif_list. Allocate a struct
  netif and pass a pointer to this structure as the first argument.
  Give pointers to cleared ip_addr structures when using DHCP,
  or fill them with sane numbers otherwise. The state pointer may be NULL.

  The init function pointer must point to an initialization function for
  your Ethernet netif interface. The following code illustrates its use.

  err_t netif_if_init(struct netif *netif)
  {
    u8_t i;

    /* Copy the MAC address of the device into the netif. */
    for (i = 0; i < ETHARP_HWADDR_LEN; i++) {
      netif->hwaddr[i] = some_eth_addr[i];
    }
    /* Bring up the Ethernet hardware (driver specific). */
    init_my_eth_device();
    return ERR_OK;
  }

  For Ethernet drivers, the input function pointer must point to the lwIP
  function ethernet_input() declared in "netif/etharp.h". Other drivers
  must use ip_input() declared in "lwip/ip.h".

- netif_set_default(struct netif *netif)

  Registers the default network interface.

- netif_set_link_up(struct netif *netif)

  This is the hardware link state; e.g. whether the cable is plugged in
  for a wired Ethernet interface. This function must be called even if
  you don't know the current state. Having link up and link down events
  is optional but DHCP and IPv6 discovery benefit from those events.

- netif_set_up(struct netif *netif)

  This is the administrative (= software) state of the netif; this
  function must be called when the netif is fully configured.

- dhcp_start(struct netif *netif)

  Creates a new DHCP client for this interface on the first call.

  You can peek in the netif->dhcp struct for the actual DHCP status.

- sys_check_timeouts()

  When the system is running, you have to periodically call
  sys_check_timeouts() which will handle all timers for all protocols in
  the stack; add this to your main loop or equivalent.


--- Optimization hints

The first thing you want to optimize is the lwip_standard_checksum()
routine from src/core/inet.c. You can override this standard
function with the #define LWIP_CHKSUM <your_checksum_routine>.

There are C examples given in inet.c or you might want to
craft an assembly function for this. RFC1071 is a good
introduction to this subject.
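As a starting point for experiments, the Internet checksum described in
RFC1071 can be sketched in portable C as below. This is not lwIP's own
routine, and the function names are hypothetical. The return convention
and byte order expected from an LWIP_CHKSUM replacement should be checked
against the checksum code of your lwIP version before substituting
anything.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * One's-complement sum of a buffer, read as big-endian 16-bit words and
 * folded to 16 bits. An odd trailing byte is padded with a zero byte.
 * The 32-bit accumulator is sufficient for buffers up to 128 KiB.
 */
uint16_t rfc1071_sum(const void *data, size_t len)
{
    const uint8_t *p = (const uint8_t *)data;
    uint32_t sum = 0;

    while (len > 1) {
        sum += ((uint32_t)p[0] << 8) | p[1];
        p += 2;
        len -= 2;
    }
    if (len == 1) {
        sum += (uint32_t)p[0] << 8;     /* pad the last byte with zero */
    }
    while (sum >> 16) {
        sum = (sum & 0xFFFFu) + (sum >> 16);   /* end-around carry */
    }
    return (uint16_t)sum;
}

/* The checksum itself is the one's complement of the folded sum. */
uint16_t rfc1071_checksum(const void *data, size_t len)
{
    return (uint16_t)~rfc1071_sum(data, len);
}
```

For the eight example bytes used in RFC1071 (00 01 f2 03 f4 f5 f6 f7) the
folded sum is 0xddf2, which makes a convenient sanity check for an
optimized replacement.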

Other significant improvements can be made by supplying
assembly or inline replacements for htons() and htonl()
if you're using a little-endian architecture.
#define lwip_htons(x) <your_htons>
#define lwip_htonl(x) <your_htonl>
If you #define them to htons() and htonl(), you should
#define LWIP_DONT_PROVIDE_BYTEORDER_FUNCTIONS to prevent lwIP from
defining hton*/ntoh* compatibility macros.
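A minimal sketch of such replacements in plain C follows. The names
my_htons() and my_htonl() are hypothetical. Both functions swap bytes
unconditionally, so they are only correct on a little-endian target; many
compilers also offer byte-swap intrinsics that may be faster.

```c
#include <stdint.h>

/* Swap the two bytes of a 16-bit value (host to network order on a
 * little-endian machine). */
uint16_t my_htons(uint16_t x)
{
    return (uint16_t)((x << 8) | (x >> 8));
}

/* Reverse the four bytes of a 32-bit value. */
uint32_t my_htonl(uint32_t x)
{
    return ((x & 0x000000FFu) << 24) |
           ((x & 0x0000FF00u) << 8)  |
           ((x & 0x00FF0000u) >> 8)  |
           ((x & 0xFF000000u) >> 24);
}
```

These could then be named in the lwip_htons / lwip_htonl defines shown
above. Since byte swapping is its own inverse, the same functions serve
for the network-to-host direction.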

Check whether your network interface driver reads at
a higher speed than the maximum wire-speed. If the
hardware isn't serviced frequently and fast enough,
buffer overflows are likely to occur.

E.g. when using the cs8900 driver, call cs8900if_service(ethif)
as frequently as possible. When using an RTOS, let the cs8900 interrupt
wake a high priority task that services your driver using a binary
semaphore or event flag. Some drivers might allow additional tuning
to match your application and network.

For a production release it is recommended to set LWIP_STATS to 0.
Note that speed performance isn't influenced much by simply setting
high values to the memory options.

For more optimization hints take a look at the lwIP wiki.

--- Zero-copy MACs

To achieve zero-copy on transmit, the data passed to the raw API must
remain unchanged until sent. Because the send- (or write-)functions return
when the packets have been enqueued for sending, data must be kept stable
after that, too.

This implies that *ALL* pbufs passed to send functions must *not* be reused by
the application unless the send function returns an error indicating the pbuf
is not sent/queued for sending.

Also, data passed to tcp_write without the copy-flag must not be changed until
sent and ACKed (check the amount of bytes marked as 'sent')!

Therefore, be careful which type of PBUF you use and if you copy TCP data
or not!
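One way to keep track of when uncopied data may be touched again is to
count bytes queued against bytes reported as sent. The sketch below is
plain C bookkeeping with hypothetical names (it is not part of lwIP): the
application would update it after each successful tcp_write() and from
its sent callback.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-connection counters kept by the application. */
struct send_tracker {
    size_t queued;   /* bytes accepted by tcp_write() so far */
    size_t acked;    /* bytes reported through the sent callback so far */
};

/* Record that tcp_write() accepted len bytes. */
void tracker_on_write(struct send_tracker *t, size_t len)
{
    t->queued += len;
}

/* Record the len value passed to the sent callback. */
void tracker_on_sent(struct send_tracker *t, size_t len)
{
    t->acked += len;
}

/* True once everything queued so far has been acknowledged, i.e. the
 * buffers handed over without the copy flag may be modified again. */
bool tracker_all_acked(const struct send_tracker *t)
{
    return t->acked >= t->queued;
}
```

This coarse check only answers "is everything acknowledged"; an
application that reuses individual buffers would need per-buffer offsets
instead.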
