1. Introduction

The 'wmem' memory manager is Wireshark's memory management framework, replacing
the old 'emem' framework which was removed in Wireshark 2.0.

In order to make memory management easier and to reduce the probability of
memory leaks, Wireshark provides its own memory management API. This API is
implemented inside epan/wmem/ and provides memory pools and functions that make
it easy to manage memory even in the face of exceptions (which many dissector
functions can raise).

Correct use of these functions will make your code faster, and greatly reduce
the chances that it will leak memory in exceptional cases.

Wmem was originally conceived in this email to the wireshark-dev mailing list:
https://www.wireshark.org/lists/wireshark-dev/201210/msg00178.html

2. Usage for Consumers

If you're writing a dissector, or other "userspace" code, then using wmem
should be very similar to using malloc or g_malloc or whatever else you're used
to. All you need to do is include the header (epan/wmem/wmem.h) and optionally
get a handle to a memory pool (if you want to *create* a memory pool, see the
section "3. Usage for Producers" below).

A memory pool is an opaque pointer to an object of type wmem_allocator_t, and
it is the very first parameter passed to almost every call you make to wmem.
Other than that parameter (and the fact that functions are prefixed wmem_)
usage is very similar to glib and other utility libraries. For example:

    wmem_alloc(myPool, 20);

allocates 20 bytes in the pool pointed to by myPool.

2.1 Memory Pool Lifetimes

Every memory pool should have a defined lifetime, or scope, after which all the
memory in that pool is unconditionally freed. When you choose to allocate memory
in a pool, you *must* be aware of its lifetime: if the lifetime is shorter than
you need, your code will contain use-after-free bugs; if the lifetime is longer
than you need, your code may contain undetectable memory leaks. In either case,
the risks outweigh the benefits.

If no pool exists whose lifetime matches the lifetime of your memory, you have
two options: create a new pool (see section 3 of this document) or use the NULL
pool. Any function that takes a pointer to a wmem_allocator_t can also be passed
NULL instead, in which case the memory is managed manually (just like malloc or
g_malloc). Memory allocated like this *must* be manually passed to wmem_free()
in order to prevent memory leaks (however these memory leaks will at least show
up in valgrind). Note that passing wmem-allocated memory directly to free()
or g_free() is not safe; the backing type of manually managed memory may be
changed without warning.
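
As a rough illustration of the two ownership models described above, here is a
minimal, self-contained sketch in plain C. It is NOT how wmem is implemented:
toy_pool_t, toy_alloc(), toy_free_manual() and toy_free_all() are hypothetical
names that exist only for this example. Passing a pool ties the memory to the
pool's lifetime; passing NULL leaves the caller responsible for freeing it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical toy pool, for illustration only (not wmem's implementation).
 * Every allocation is prefixed with a small header. The union pads the
 * header so that the payload after it stays suitably aligned. */
typedef union toy_header {
    union toy_header *next;
    long double       align;
} toy_header_t;

typedef struct {
    toy_header_t *blocks; /* allocations owned by this pool */
    size_t        count;  /* how many allocations are live */
} toy_pool_t;

void toy_pool_init(toy_pool_t *pool)
{
    assert(pool != NULL);
    pool->blocks = NULL;
    pool->count  = 0;
}

/* Allocate size bytes. With a pool, the pool owns the memory and releases
 * it in toy_free_all(). With pool == NULL, the caller owns the memory and
 * must pass it to toy_free_manual(). */
void *toy_alloc(toy_pool_t *pool, size_t size)
{
    toy_header_t *hdr = malloc(sizeof *hdr + size);

    if (hdr == NULL)
        return NULL;
    hdr->next = NULL;
    if (pool != NULL) {
        hdr->next    = pool->blocks;
        pool->blocks = hdr;
        pool->count++;
    }
    return hdr + 1; /* payload starts right after the header */
}

/* Release memory that was allocated with pool == NULL. */
void toy_free_manual(void *ptr)
{
    if (ptr != NULL)
        free((toy_header_t *)ptr - 1);
}

/* End of the pool's lifetime: everything allocated in it goes away at once. */
void toy_free_all(toy_pool_t *pool)
{
    while (pool->blocks != NULL) {
        toy_header_t *next = pool->blocks->next;
        free(pool->blocks);
        pool->blocks = next;
    }
    pool->count = 0;
}
```

In this toy, the pointer handed back to the caller is not the pointer that
malloc() returned, which is one way in which passing such memory straight to
free() can go wrong. Whether wmem's NULL pool works like that is an
implementation detail you should not rely on.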

2.2 Wireshark Global Pools

Dissectors that include the wmem header file will have three pools available
to them automatically: wmem_packet_scope(), wmem_file_scope() and
wmem_epan_scope().

The packet pool is scoped to the dissection of each packet, meaning that any
memory allocated in it will be automatically freed at the end of the current
packet. The file pool is similarly scoped to the dissection of each file,
meaning that any memory allocated in it will be automatically freed when the
current capture file is closed.

NB: Using these pools outside of the appropriate scope (e.g. using the packet
    pool when there isn't a packet being dissected) will throw an assertion.
    See the comment in epan/wmem/wmem_scopes.c for details.

The epan pool is scoped to the library's lifetime - memory allocated in it is
not freed until epan_cleanup() is called, which is typically but not necessarily
at the very end of the program.

2.3 The Pinfo Pool

Certain allocations (such as AT_STRINGZ address allocations and anything that
might end up being passed to add_new_data_source) need their memory to stick
around a little longer than the usual packet scope - basically until the
next packet is dissected. This is, in fact, the scope of Wireshark's pinfo
structure, so the pinfo struct has a 'pool' member which is a wmem pool scoped
to the lifetime of the pinfo struct.

2.4 API

Full documentation for each function (parameters, return values, behaviours)
lives (or will live) in Doxygen-format in the header files for those functions.
This is just an overview of which header files you should be looking at.

2.4.1 Core API

wmem_core.h
 - Basic memory management functions (wmem_alloc, wmem_realloc, wmem_free).

2.4.2 Strings

wmem_strutl.h
 - Utility functions for manipulating null-terminated C-style strings.
   Functions like strdup and strdup_printf.

wmem_strbuf.h
 - A managed string object implementation, similar to std::string in C++ or
   GString from Glib.

2.4.3 Container Data Structures

wmem_array.h
 - A growable array (AKA vector) implementation.

wmem_list.h
 - A doubly-linked list implementation.

wmem_map.h
 - A hash map (AKA hash table) implementation.

wmem_queue.h
 - A queue implementation (first-in, first-out).

wmem_stack.h
 - A stack implementation (last-in, first-out).

wmem_tree.h
 - A balanced binary tree (red-black tree) implementation.

2.4.4 Miscellaneous Utilities

wmem_miscutl.h
 - Misc. utility functions like memdup.

2.5 Callbacks

WARNING: You probably don't actually need these; use them only when you're
         sure you understand the dangers.

Sometimes (though hopefully rarely) it may be necessary to store data in a wmem
pool that requires additional cleanup before it is freed. For example, perhaps
you have a pointer to a file-handle that needs to be closed. In this case, you
can register a callback with the wmem_register_callback function
declared in wmem_user_cb.h. Every time the memory in a pool is freed, all
registered cleanup functions are called first.

Note that callback calling order is not defined; you cannot rely on a
certain callback being called before or after another.

WARNING: Manually freeing or moving memory (with wmem_free or wmem_realloc)
         will NOT trigger any callbacks. It is an error to call either of
         those functions on memory if you have a callback registered to deal
         with the contents of that memory.
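
The sketch below models the behaviour described in this section with a
hypothetical toy pool. The names toy_cb_pool_t, toy_register_callback(),
toy_free_all(), toy_destroy() and close_file_cb() are invented for this
example; for the real declaration of wmem_register_callback, see
wmem_user_cb.h. It shows a cleanup callback closing a file handle just before
the pool's memory would be released.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical callback type: receives whatever was registered with it. */
typedef void (*toy_cleanup_fn)(void *user_data);

typedef struct toy_callback {
    toy_cleanup_fn       fn;
    void                *user_data;
    struct toy_callback *next;
} toy_callback_t;

typedef struct {
    toy_callback_t *callbacks;      /* registered cleanup functions */
    unsigned        free_all_calls; /* how often the pool was emptied */
} toy_cb_pool_t;

/* Returns 0 on success, -1 if the registration could not be stored. */
int toy_register_callback(toy_cb_pool_t *pool, toy_cleanup_fn fn,
                          void *user_data)
{
    toy_callback_t *node;

    assert(pool != NULL && fn != NULL);
    node = malloc(sizeof *node);
    if (node == NULL)
        return -1;
    node->fn        = fn;
    node->user_data = user_data;
    node->next      = pool->callbacks;
    pool->callbacks = node;
    return 0;
}

/* Run every registered callback, then empty the pool. */
void toy_free_all(toy_cb_pool_t *pool)
{
    toy_callback_t *node;

    for (node = pool->callbacks; node != NULL; node = node->next)
        node->fn(node->user_data);
    /* ... a real pool would release its allocations here ... */
    pool->free_all_calls++;
}

/* Empty the pool one last time and drop the registrations themselves. */
void toy_destroy(toy_cb_pool_t *pool)
{
    toy_free_all(pool);
    while (pool->callbacks != NULL) {
        toy_callback_t *next = pool->callbacks->next;
        free(pool->callbacks);
        pool->callbacks = next;
    }
}

/* Example callback: user_data is a FILE ** owned by the caller. Clearing
 * the handle makes the callback safe to run more than once. */
void close_file_cb(void *user_data)
{
    FILE **fp = user_data;

    if (*fp != NULL) {
        fclose(*fp);
        *fp = NULL;
    }
}
```

This toy happens to run callbacks in reverse registration order. As noted
above, real code must not depend on any particular order.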

3. Usage for Producers

NB: If you're just writing a dissector, you probably don't need to read
    this section.

One of the problems with the old emem framework was that there were basically
two allocator backends (glib and mmap) that were all mixed together in a mess
of if statements, environment variables and #ifdefs. In wmem the different
allocator backends are cleanly separated out, and it's up to the owner of the
pool to pick one.

3.1 Available Allocator Back-Ends

Each available allocator type has a corresponding entry in the
wmem_allocator_type_t enumeration defined in wmem_core.h. See the doxygen
comments in that header file for details on each type.

3.2 Creating a Pool

To create a pool, include the regular wmem header and call the
wmem_allocator_new() function with the appropriate type value.
For example:

    #include "wmem/wmem.h"

    wmem_allocator_t *myPool;
    myPool = wmem_allocator_new(WMEM_ALLOCATOR_SIMPLE);

From here on in, you don't need to remember which type of allocator you used
(although allocator authors are welcome to expose additional allocator-specific
helper functions in their headers). The "myPool" variable can be passed around
and used as normal in allocation requests as described in section 2 of this
document.

3.3 Destroying a Pool

Regardless of which allocator you used to create a pool, it can be destroyed
with a call to the function wmem_destroy_allocator(). For example:

    #include "wmem/wmem.h"

    wmem_allocator_t *myPool;

    myPool = wmem_allocator_new(WMEM_ALLOCATOR_SIMPLE);

    /* Allocate some memory in myPool ... */

    wmem_destroy_allocator(myPool);

Destroying a pool will free all the memory allocated in it.

3.4 Reusing a Pool

It is possible to free all the memory in a pool without destroying it,
allowing it to be reused later. Depending on the type of allocator, doing this
(by calling wmem_free_all()) can be significantly cheaper than fully destroying
and recreating the pool. This method is therefore recommended, especially when
the pool would otherwise be scoped to a single iteration of a loop. For example:

    #include "wmem/wmem.h"

    wmem_allocator_t *myPool;

    myPool = wmem_allocator_new(WMEM_ALLOCATOR_SIMPLE);
    for (...) {

        /* Allocate some memory in myPool ... */

        /* Free the memory, faster than destroying and recreating
           the pool each time through the loop. */
        wmem_free_all(myPool);
    }
    wmem_destroy_allocator(myPool);

4. Internal Design

Despite being written in Wireshark's standard C90, wmem follows a fairly
object-oriented design pattern. Although efficiency is always a concern, the
primary goals in writing wmem were maintainability and preventing memory
leaks.

4.1 struct _wmem_allocator_t

The heart of wmem is the _wmem_allocator_t structure defined in the
wmem_allocator.h header file. This structure uses C function pointers to
implement a common object-oriented design pattern known as an interface (also
known as an abstract class to those who are more familiar with C++).

Different allocator implementations can provide exactly the same interface by
assigning their own functions to the members of an instance of the structure.
The structure has eight members in three groups.

4.1.1 Implementation Details

 - private_data
 - type

The private_data pointer is a void pointer that the allocator implementation can
use to store whatever internal structures it needs. A pointer to private_data is
passed to almost all of the other functions that the allocator implementation
must define.

The type field is an enumeration of type wmem_allocator_type_t (see
section 3.1). Its value is set by the wmem_allocator_new() function, not
by the implementation-specific constructor. This field should be considered
read-only by the allocator implementation.

4.1.2 Consumer Functions

 - walloc()
 - wfree()
 - wrealloc()

These function pointers should be set to functions with semantics obviously
similar to their standard-library namesakes. Each one takes an extra parameter
that is a copy of the allocator's private_data pointer.

Note that wrealloc() and wfree() are not expected to be called directly by user
code in most cases - they are primarily optimizations for use by data
structures that wmem might want to implement (it's inefficient, for example, to
implement a dynamically sized array without some form of realloc).

Also note that allocators do not have to handle NULL pointers or 0-length
requests in any way - those checks are done in an allocator-agnostic way
higher up in wmem. Allocator authors can assume that all incoming pointers
(to wrealloc and wfree) are non-NULL, and that all incoming lengths (to walloc
and wrealloc) are non-0.
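
A minimal sketch of what such allocator-agnostic checks could look like,
assuming a reduced, hypothetical interface: mini_allocator_t and the mini_*
and backend_* functions are made up for this example and are not wmem's code.
The back-end asserts its preconditions instead of handling NULL or zero,
because the front-end filters those cases out first. A NULL allocator falls
back to plain malloc, in the spirit of the NULL pool from section 2.1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, reduced allocator interface: consumer functions only. */
typedef struct {
    void  *private_data;
    void *(*walloc)(void *private_data, size_t size);
    void  (*wfree)(void *private_data, void *ptr);
    void *(*wrealloc)(void *private_data, void *ptr, size_t size);
} mini_allocator_t;

/* Front-end: handles NULL pools, NULL pointers and 0-length requests so
 * that no back-end ever has to. */
void *mini_alloc(mini_allocator_t *a, size_t size)
{
    if (size == 0)
        return NULL;
    if (a == NULL)
        return malloc(size); /* NULL pool: manually managed */
    return a->walloc(a->private_data, size);
}

void mini_free(mini_allocator_t *a, void *ptr)
{
    if (ptr == NULL)
        return;
    if (a == NULL) {
        free(ptr);
        return;
    }
    a->wfree(a->private_data, ptr);
}

void *mini_realloc(mini_allocator_t *a, void *ptr, size_t size)
{
    if (ptr == NULL)
        return mini_alloc(a, size); /* nothing to move: plain allocation */
    if (size == 0) {
        mini_free(a, ptr);          /* shrinking to nothing: plain free */
        return NULL;
    }
    if (a == NULL)
        return realloc(ptr, size);
    return a->wrealloc(a->private_data, ptr, size);
}

/* Example back-end: private_data points at a call counter. It trusts the
 * front-end, so it only asserts what it has been promised. */
void *backend_alloc(void *private_data, size_t size)
{
    assert(size != 0);
    ++*(unsigned *)private_data;
    return malloc(size);
}

void backend_free(void *private_data, void *ptr)
{
    assert(ptr != NULL);
    ++*(unsigned *)private_data;
    free(ptr);
}

void *backend_realloc(void *private_data, void *ptr, size_t size)
{
    assert(ptr != NULL && size != 0);
    ++*(unsigned *)private_data;
    return realloc(ptr, size);
}
```

Keeping the edge cases in one place means each new back-end has fewer ways to
get them wrong, and all back-ends behave identically for NULL and zero.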

4.1.3 Producer/Manager Functions

 - free_all()
 - gc()
 - cleanup()

All of these functions take only one parameter, which is the allocator's
private_data pointer.

The free_all() function should free all the memory currently allocated in the
pool. Note that this is not necessarily exactly the same as calling free()
on all the allocated blocks - free_all() is allowed to do additional cleanup
or to make use of optimizations not available when freeing one block at a time.

The gc() function should do whatever it can to reduce excess memory usage in
the dissector by returning unused blocks to the OS, optimizing internal data
structures, etc.

The cleanup() function should do any final cleanup and free any and all memory.
It is basically the equivalent of a destructor function. For simplicity, wmem
is guaranteed to call free_all() immediately before calling this function. There
is no such guarantee that gc() has (ever) been called.
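
To make the shape of this interface concrete, here is a self-contained sketch
of an eight-member structure with one trivial back-end behind it. Everything
in it is hypothetical (toy_allocator_t, TOY_ALLOCATOR_TABLE, TOY_MAX_BLOCKS and
the table_* functions); the real definition lives in wmem_allocator.h and may
differ in names and details. The back-end simply wraps malloc and records live
blocks in a small fixed table so that free_all() can find them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#define TOY_MAX_BLOCKS 64
typedef enum { TOY_ALLOCATOR_TABLE } toy_allocator_type_t;

/* Hypothetical eight-member interface, grouped as in 4.1.1 to 4.1.3. */
typedef struct {
    void                 *private_data;
    toy_allocator_type_t  type;
    void *(*walloc)(void *private_data, size_t size);
    void  (*wfree)(void *private_data, void *ptr);
    void *(*wrealloc)(void *private_data, void *ptr, size_t size);
    void  (*free_all)(void *private_data);
    void  (*gc)(void *private_data);
    void  (*cleanup)(void *private_data);
} toy_allocator_t;

/* Back-end state: a fixed table of live malloc'd blocks (NULL = empty). */
typedef struct { void *blocks[TOY_MAX_BLOCKS]; } table_state_t;

/* Find the slot holding ptr; pass NULL to find an empty slot. */
static void **table_slot(table_state_t *st, const void *ptr)
{
    size_t i;
    for (i = 0; i < TOY_MAX_BLOCKS; i++)
        if (st->blocks[i] == ptr)
            return &st->blocks[i];
    return NULL;
}

static void *table_walloc(void *pd, size_t size)
{
    void **slot = table_slot(pd, NULL);
    if (slot != NULL)
        *slot = malloc(size);
    return (slot != NULL) ? *slot : NULL;
}

static void table_wfree(void *pd, void *ptr)
{
    void **slot = table_slot(pd, ptr);
    if (slot != NULL) {
        free(ptr);
        *slot = NULL;
    }
}

static void *table_wrealloc(void *pd, void *ptr, size_t size)
{
    void **slot  = table_slot(pd, ptr);
    void  *moved = (slot != NULL) ? realloc(ptr, size) : NULL;
    if (moved != NULL)
        *slot = moved; /* the block may have moved: update the table */
    return moved;
}

static void table_free_all(void *pd)
{
    table_state_t *st = pd;
    size_t i;
    for (i = 0; i < TOY_MAX_BLOCKS; i++) {
        free(st->blocks[i]); /* free(NULL) is a no-op */
        st->blocks[i] = NULL;
    }
}

static void table_gc(void *pd) { (void)pd; } /* nothing cached to return */
static void table_cleanup(void *pd) { free(pd); }

toy_allocator_t *toy_allocator_new(toy_allocator_type_t type)
{
    toy_allocator_t *a  = malloc(sizeof *a);
    table_state_t   *st = calloc(1, sizeof *st); /* zeroed table = empty */

    if (a == NULL || st == NULL) {
        free(a);
        free(st);
        return NULL;
    }
    a->private_data = st;
    a->type     = type; /* set by the generic constructor */
    a->walloc   = table_walloc;
    a->wfree    = table_wfree;
    a->wrealloc = table_wrealloc;
    a->free_all = table_free_all;
    a->gc       = table_gc;
    a->cleanup  = table_cleanup;
    return a;
}

void toy_destroy_allocator(toy_allocator_t *a)
{
    assert(a != NULL);
    a->free_all(a->private_data); /* always runs before cleanup() */
    a->cleanup(a->private_data);
    free(a);
}
```

Note how toy_destroy_allocator() calls free_all() before cleanup(), matching
the guarantee described above, and never calls gc(). The sketch relies on
calloc's all-zero bytes reading back as NULL pointers, which holds on
mainstream platforms but is an assumption of this example.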

4.2 Pool-Agnostic API

One of the issues with emem was that the API (including the public data
structures) required wrapper functions for each scope implemented. Even
if there was a stack implementation in emem, it wasn't necessarily available
for use with file-scope memory unless someone took the time to write se_stack_
wrapper functions for the interface.

In wmem, all public APIs take the pool as the first argument, so that they can
be written once and used with any available memory pool. Data structures like
wmem's stack implementation only take the pool when created - the provided
pointer is stored internally with the data structure, and subsequent calls
(like push and pop) will take the stack itself instead of the pool.
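
The pattern can be sketched in a few lines of plain C. This is an illustrative
toy, not wmem's stack: toy_pool_t, toy_stack_t and their functions are
hypothetical names, and the pool is deliberately simplistic. The point is only
that the pool is passed once, at creation, and remembered by the structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical toy pool (not wmem): a list of malloc'd blocks that are only
 * released all together. The union keeps the payload suitably aligned. */
typedef union toy_block {
    union toy_block *next;
    long double      align;
} toy_block_t;

typedef struct { toy_block_t *blocks; } toy_pool_t;

void *toy_pool_alloc(toy_pool_t *pool, size_t size)
{
    toy_block_t *blk = malloc(sizeof *blk + size);

    if (blk == NULL)
        return NULL;
    blk->next    = pool->blocks;
    pool->blocks = blk;
    return blk + 1;
}

void toy_pool_free_all(toy_pool_t *pool)
{
    while (pool->blocks != NULL) {
        toy_block_t *next = pool->blocks->next;
        free(pool->blocks);
        pool->blocks = next;
    }
}

/* A stack written once, usable with any pool. */
typedef struct toy_frame {
    void             *data;
    struct toy_frame *below;
} toy_frame_t;

typedef struct {
    toy_pool_t  *pool; /* remembered at creation time */
    toy_frame_t *top;
} toy_stack_t;

/* The only call that takes the pool. */
toy_stack_t *toy_stack_new(toy_pool_t *pool)
{
    toy_stack_t *stack;

    assert(pool != NULL);
    stack = toy_pool_alloc(pool, sizeof *stack);
    if (stack != NULL) {
        stack->pool = pool;
        stack->top  = NULL;
    }
    return stack;
}

/* No pool argument: the stack already knows where its memory comes from.
 * Returns 0 on success, -1 on allocation failure. */
int toy_stack_push(toy_stack_t *stack, void *data)
{
    toy_frame_t *frame = toy_pool_alloc(stack->pool, sizeof *frame);

    if (frame == NULL)
        return -1;
    frame->data  = data;
    frame->below = stack->top;
    stack->top   = frame;
    return 0;
}

/* Returns NULL when the stack is empty. Popped frames stay in the pool
 * until toy_pool_free_all(), a simplification made for this sketch. */
void *toy_stack_pop(toy_stack_t *stack)
{
    toy_frame_t *frame = stack->top;

    if (frame == NULL)
        return NULL;
    stack->top = frame->below;
    return frame->data;
}
```

Because the stack itself lives in the pool, it shares the pool's lifetime and
must not be used after the pool has been emptied.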

4.3 Debugging

The primary debugging control for wmem is the WIRESHARK_DEBUG_WMEM_OVERRIDE
environment variable. If set, this value forces all calls to
wmem_allocator_new() to return the same type of allocator, regardless of which
type is requested normally by the code. It currently has four valid values:

 - The value "simple" forces the use of WMEM_ALLOCATOR_SIMPLE. The valgrind
   script currently sets this value, since the simple allocator is the only
   one whose memory allocations are trackable properly by valgrind.

 - The value "strict" forces the use of WMEM_ALLOCATOR_STRICT. The fuzz-test
   script currently sets this value, since the goal when fuzz-testing is to find
   as many errors as possible.

 - The value "block" forces the use of WMEM_ALLOCATOR_BLOCK. This is not
   currently used by any scripts, but is useful for stress-testing the block
   allocator.

 - The value "block_fast" forces the use of WMEM_ALLOCATOR_BLOCK_FAST. This is
   not currently used by any scripts, but is useful for stress-testing the fast
   block allocator.

Note that regardless of the value of this variable, it will always be safe to
call allocator-specific helper functions. They are required to be safe no-ops
if the allocator argument is of the wrong type.
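
The override logic described in this section amounts to a string-to-type
mapping applied before an allocator is built. The sketch below shows one way
that could be written. The TOY_ALLOCATOR_* enumeration, toy_pick_type() and
toy_effective_type() are hypothetical; only the variable name and its four
values come from this document. Reading the environment on every call, and
keeping the requested type when the value is not recognized, are assumptions
of the sketch rather than a description of what wmem does.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the allocator types named in this section. */
typedef enum {
    TOY_ALLOCATOR_SIMPLE,
    TOY_ALLOCATOR_BLOCK,
    TOY_ALLOCATOR_STRICT,
    TOY_ALLOCATOR_BLOCK_FAST
} toy_allocator_type_t;

/* Decide which type to build, given the requested type and the override
 * string (NULL when the variable is unset). */
toy_allocator_type_t toy_pick_type(toy_allocator_type_t requested,
                                   const char *value)
{
    if (value == NULL)
        return requested;
    if (strcmp(value, "simple") == 0)
        return TOY_ALLOCATOR_SIMPLE;
    if (strcmp(value, "strict") == 0)
        return TOY_ALLOCATOR_STRICT;
    if (strcmp(value, "block") == 0)
        return TOY_ALLOCATOR_BLOCK;
    if (strcmp(value, "block_fast") == 0)
        return TOY_ALLOCATOR_BLOCK_FAST;
    return requested; /* unrecognized value: assumed to be ignored */
}

/* Convenience wrapper that consults the environment. */
toy_allocator_type_t toy_effective_type(toy_allocator_type_t requested)
{
    return toy_pick_type(requested, getenv("WIRESHARK_DEBUG_WMEM_OVERRIDE"));
}
```

Splitting the decision from the getenv() call keeps the mapping easy to test
without touching the process environment.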

4.4 Testing

There is a simple test suite for wmem that lives in the file wmem_test.c and
should get automatically built into the binary 'wmem_test' when building
Wireshark. It contains at least basic tests for all existing functionality.
The suite is run automatically by the build-bots via the shell script
test/test.sh which calls out to test/suite-unittests.sh.

New features added to wmem (allocators, data structures, utility
functions, etc.) MUST also have tests added to this suite.

The test suite could potentially use a clean-up by someone more
intimately familiar with Glib's testing framework, but it does the job.

5. A Note on Performance

Because of my own bad judgment, there is the persistent idea floating around
that wmem is somehow magically faster than other allocators in the general case.
This is false.

First, wmem supports multiple different allocator backends (see sections 3 and 4
of this document), so it is confusing and misleading to try to compare the
performance of "wmem" in general with another system anyway.

Second, any modern system-provided malloc already has a very clever and
efficient allocator algorithm that makes use of blocks, arenas and all sorts of
other fancy tricks. Trying to be faster than libc's allocator is generally a
waste of time unless you have a specific allocation pattern to optimize for.

Third, while there were historically arguments to be made for putting something
in front of the kernel to reduce the number of context-switches, modern libc
implementations should already do that. Making a dynamic library call is still
marginally more expensive than calling a locally-defined linker-optimized
function, but it's a difference too small to care about.

With all that said, it is true that *some* of wmem's allocators can be
substantially faster than your standard libc malloc, in *some* use cases:
 - The BLOCK and BLOCK_FAST allocators both provide very efficient free_all
   operations, which can be many orders of magnitude faster than calling free()
   on each individual allocation.
 - The BLOCK_FAST allocator in particular is optimized for Wireshark's packet
   scope pool. That pool has an extremely short, well-defined lifetime, and a
   very regular pattern of allocations; I was able to use that knowledge to
   beat libc rather handily, *in that specific use case*.
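
To see why a bulk free can be so much cheaper than many individual ones,
consider the simplest possible case: a bump arena. This sketch is NOT the
BLOCK or BLOCK_FAST allocator and makes no claim about how they work
internally; arena_t, arena_alloc() and arena_free_all() are hypothetical names.
It only illustrates the general principle that when allocations are never
freed one at a time, releasing all of them can be a single assignment.

```c
#include <assert.h>
#include <stddef.h>

#define ARENA_SIZE  4096u
#define ARENA_ALIGN 16u /* every allocation is rounded up to this */

/* Hypothetical fixed-size bump arena. The union makes the buffer start at
 * an address suitably aligned for long double. */
typedef struct {
    union {
        unsigned char bytes[ARENA_SIZE];
        long double   align;
    } buf;
    size_t used; /* bytes handed out so far */
} arena_t;

/* Hand out the next free bytes, or NULL if the request does not fit. */
void *arena_alloc(arena_t *a, size_t size)
{
    size_t rounded;
    void  *p;

    assert(a != NULL);
    if (size == 0)
        return NULL;
    rounded = (size + ARENA_ALIGN - 1) / ARENA_ALIGN * ARENA_ALIGN;
    if (rounded < size || rounded > ARENA_SIZE - a->used)
        return NULL; /* arithmetic overflow or arena exhausted */
    p = a->buf.bytes + a->used;
    a->used += rounded;
    return p;
}

/* Constant time, no matter how many allocations were made. */
void arena_free_all(arena_t *a)
{
    assert(a != NULL);
    a->used = 0;
}
```

The cost of this design is that individual allocations cannot be returned
early, which is only acceptable when the pool's lifetime is short and
well-defined.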

/*
 * Editor modelines - https://www.wireshark.org/tools/modelines.html
 *
 * Local variables:
 * c-basic-offset: 4
 * tab-width: 8
 * indent-tabs-mode: nil
 * End:
 *
 * vi: set shiftwidth=4 tabstop=8 expandtab:
 * :indentSize=4:tabSize=8:noTabs=true:
 */
