Updated: 2022/Sep/29
UVM(9)                     Kernel Developer's Manual                    UVM(9)
NAME
     uvm - virtual memory system external interface
SYNOPSIS
     #include <sys/param.h>
     #include <uvm/uvm.h>
DESCRIPTION
     The UVM virtual memory system manages access to the computer's memory
     resources.  User processes and the kernel access these resources through
     UVM's external interface.  UVM's external interface includes functions
     that:
     -   initialize UVM sub-systems
     -   manage virtual address spaces
     -   resolve page faults
     -   memory map files and devices
     -   perform uio-based I/O to virtual memory
     -   allocate and free kernel virtual memory
     -   allocate and free physical memory
     In addition to exporting these services, UVM has two kernel-level
     processes: pagedaemon and swapper.  The pagedaemon process sleeps until
     physical memory becomes scarce.  When that happens, pagedaemon is awoken.
     It scans physical memory, paging out and freeing memory that has not been
     recently used.  The swapper process swaps in runnable processes that are
     currently swapped out, if there is room.
     There are also several miscellaneous functions.
INITIALIZATION
     void
     uvm_init(void);
     void
     uvm_init_limits(struct lwp *l);
     void
     uvm_setpagesize(void);
     void
     uvm_swap_init(void);
     uvm_init() sets up the UVM system at system boot time, after the console
     has been set up.  It initializes global state, the page, map, kernel
     virtual memory state, machine-dependent physical map, kernel memory
     allocator, pager and anonymous memory sub-systems, and then enables
     paging of kernel objects.
     uvm_init_limits() initializes process limits for the named process.  This
     is for use by the system startup for process zero, before any other
     processes are created.
     uvm_md_init() does early boot initialization.  This currently includes:
     -   uvm_setpagesize(), which initializes the uvmexp members pagesize (if
         not already done by machine-dependent code), pageshift and pagemask.
     -   uvm_physseg_init(), which initializes the uvm_hotplug(9) subsystem.
         It should be called by machine-dependent code early in the
         pmap_init() call (see pmap(9)).
     uvm_swap_init() initializes the swap sub-system.
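     The relationship between the three page constants set by
     uvm_setpagesize() can be sketched in user space as follows.  This is a
     minimal illustration, not the kernel code: struct page_consts and
     derive_page_consts() are hypothetical names, and the sketch assumes that
     pagemask is pagesize - 1 and pageshift is the base-2 logarithm of
     pagesize.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the three uvmexp page constants. */
struct page_consts {
	int pagesize;	/* size of a page in bytes; must be a power of 2 */
	int pagemask;	/* assumed to be pagesize - 1 */
	int pageshift;	/* assumed to be log2(pagesize) */
};

/*
 * Derive pagemask and pageshift from pagesize.
 * Returns 0 on success, or -1 if pagesize is not a positive power of two.
 */
static int
derive_page_consts(struct page_consts *pc)
{
	int shift;

	assert(pc != NULL);
	if (pc->pagesize <= 0 || (pc->pagesize & (pc->pagesize - 1)) != 0)
		return -1;

	pc->pagemask = pc->pagesize - 1;
	/* Find the shift whose power of two equals pagesize. */
	for (shift = 0; (1 << shift) != pc->pagesize; shift++)
		continue;
	pc->pageshift = shift;
	return 0;
}
```

     With pagesize set to 4096, the sketch yields a pagemask of 4095 and a
     pageshift of 12.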
VIRTUAL ADDRESS SPACE MANAGEMENT
     See uvm_map(9).
PAGE FAULT HANDLING
     int
     uvm_fault(struct vm_map *orig_map, vaddr_t vaddr, vm_prot_t access_type);
     uvm_fault() is the main entry point for faults.  It takes orig_map as the
     map the fault originated in, vaddr as the offset into the map at which
     the fault occurred, and access_type describing the type of access
     requested.  uvm_fault() returns a standard UVM return value.
MEMORY MAPPING FILES AND DEVICES
     See ubc(9).
VIRTUAL MEMORY I/O
     int
     uvm_io(struct vm_map *map, struct uio *uio);
     uvm_io() performs the I/O described in uio on the memory described in
     map.
ALLOCATION OF KERNEL MEMORY
     See uvm_km(9).
ALLOCATION OF PHYSICAL MEMORY
     struct vm_page *
     uvm_pagealloc(struct uvm_object *uobj, voff_t off, struct vm_anon *anon,
     int flags);
     void
     uvm_pagerealloc(struct vm_page *pg, struct uvm_object *newobj, voff_t
     newoff);
     void
     uvm_pagefree(struct vm_page *pg);
     int
     uvm_pglistalloc(psize_t size, paddr_t low, paddr_t high, paddr_t
     alignment, paddr_t boundary, struct pglist *rlist, int nsegs, int
     waitok);
     void
     uvm_pglistfree(struct pglist *list);
     void
     uvm_page_physload(paddr_t start, paddr_t end, paddr_t avail_start,
     paddr_t avail_end, int free_list);
     uvm_pagealloc() allocates a page of memory at offset off in either the
     object uobj or the anonymous memory anon, which must be locked by the
     caller.  Only one of uobj and anon can be non-NULL.  It returns NULL
     when no page can be found.  The flags can be any of
     #define UVM_PGA_USERESERVE      0x0001  /* ok to use reserve pages */
     #define UVM_PGA_ZERO            0x0002  /* returned page must be zero'd */
     UVM_PGA_USERESERVE means to allocate a page even if that will result in
     the number of free pages being lower than uvmexp.reserve_pagedaemon (if
     the current thread is the pagedaemon) or uvmexp.reserve_kernel (if the
     current thread is not the pagedaemon).  UVM_PGA_ZERO causes the returned
     page to be filled with zeroes, either by allocating it from a pool of
     pre-zeroed pages or by zeroing it in-line as necessary.
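     The effect of UVM_PGA_USERESERVE described above can be modelled in user
     space.  The sketch below is a simplified reading of the preceding
     paragraph, not the kernel's allocation path: struct reserve_model and
     may_allocate() are hypothetical, and the exact comparison the kernel
     performs against the reserve counters is an assumption.

```c
#include <stdbool.h>

#define UVM_PGA_USERESERVE	0x0001	/* ok to use reserve pages */
#define UVM_PGA_ZERO		0x0002	/* returned page must be zero'd */

/* Hypothetical snapshot of the relevant uvmexp counters. */
struct reserve_model {
	int free;		/* number of free pages */
	int reserve_pagedaemon;	/* pages reserved for pagedaemon */
	int reserve_kernel;	/* pages reserved for kernel */
};

/*
 * Decide whether one more page may be handed out.  Without
 * UVM_PGA_USERESERVE the allocation is refused if it would leave fewer
 * free pages than the reserve that applies to the calling thread.
 * With the flag set, the reserve may be consumed.
 */
static bool
may_allocate(const struct reserve_model *m, bool is_pagedaemon, int flags)
{
	int reserve = is_pagedaemon ?
	    m->reserve_pagedaemon : m->reserve_kernel;

	if (m->free <= 0)
		return false;		/* nothing left at all */
	if (flags & UVM_PGA_USERESERVE)
		return true;		/* allowed to dip into the reserve */
	return (m->free - 1) >= reserve;
}
```

     In this model, with 10 free pages and a kernel reserve of 16, an
     ordinary thread is refused unless it passes UVM_PGA_USERESERVE.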
     uvm_pagerealloc() reallocates page pg to a new object newobj, at a new
     offset newoff.
     uvm_pagefree() frees the physical page pg.  If the content of the page is
     known to be zero-filled, caller should set PG_ZERO in pg->flags so that
     the page allocator will use the page to serve future UVM_PGA_ZERO
     requests efficiently.
     uvm_pglistalloc() allocates a list of pages totalling size bytes under
     various constraints.  low and high describe the lowest and highest
     addresses acceptable for the list.  If alignment is non-zero, it
     describes the required alignment of the list, in power-of-two notation.
     If boundary is non-zero, no segment of the list may cross this
     power-of-two boundary, relative to zero.  nsegs is the maximum number of
     physically contiguous segments.  If waitok is non-zero, the function may
     sleep until enough memory is available.  (It also may give up in some
     situations, so a non-zero waitok does not imply that uvm_pglistalloc()
     cannot return an error.)  The allocated memory is returned in the rlist
     list; the caller only has to provide storage for the list, which is
     initialized by uvm_pglistalloc().
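     The range, alignment and boundary constraints above can be expressed as
     a predicate on a single candidate segment.  The following user-space
     sketch is illustrative only: segment_ok() is a hypothetical helper, it
     assumes that alignment and boundary are powers of two when non-zero, and
     it treats high as the highest acceptable address (inclusive), which is
     one reading of the text rather than a statement about the
     implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Check one physically contiguous segment [addr, addr + size) against
 * the constraints.  A zero alignment or boundary means "no constraint".
 */
static bool
segment_ok(uint64_t addr, uint64_t size, uint64_t low, uint64_t high,
    uint64_t alignment, uint64_t boundary)
{
	uint64_t end = addr + size - 1;	/* last byte of the segment */

	if (size == 0 || end < addr)
		return false;		/* empty, or wrapped around */
	if (addr < low || end > high)
		return false;		/* outside the acceptable range */
	if (alignment != 0 && (addr & (alignment - 1)) != 0)
		return false;		/* start is not aligned */
	/*
	 * The first and last byte must lie in the same boundary-sized
	 * block, counted from address zero.
	 */
	if (boundary != 0 && ((addr ^ end) & ~(boundary - 1)) != 0)
		return false;
	return true;
}
```

     For example, an 8 KB segment starting at 0xf000 crosses a 64 KB
     boundary at 0x10000 and is rejected, while the same segment starting at
     0x10000 is accepted.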
     uvm_pglistfree() frees the list of pages pointed to by list.  If the
     content of a page on the list is known to be zero-filled, the caller
     should set PG_ZERO in that page's pg->flags so that the page allocator
     will use the page to serve future UVM_PGA_ZERO requests efficiently.
     uvm_page_physload() loads physical memory segments into VM space on the
     specified free_list.  It must be called at system boot time to set up
     physical memory management pages.  The arguments describe the start and
     end of the physical addresses of the segment, and the available start and
     end addresses of pages not already in use.  If a system has memory banks
     of different speeds, the slower memory should be given a higher free_list
     value.
PROCESSES
     void
     uvm_pageout(void);
     void
     uvm_scheduler(void);
     uvm_pageout() is the main loop for the page daemon.
     uvm_scheduler() is the process zero main loop, which is to be called
     after the system has finished starting other processes.  It handles the
     swapping in of runnable, swapped out processes in priority order.
PAGE LOAN
     int
     uvm_loan(struct vm_map *map, vaddr_t start, vsize_t len, void *v, int
     flags);
     void
     uvm_unloan(void *v, int npages, int flags);
     uvm_loan() loans pages in a map out to anons or to the kernel.  map
     should be unlocked; start and len should be multiples of PAGE_SIZE.
     Argument flags should be one of
     #define UVM_LOAN_TOANON       0x01    /* loan to anons */
     #define UVM_LOAN_TOPAGE       0x02    /* loan to kernel */
     v should be a pointer to an array of pointers to struct vm_anon or struct
     vm_page, as appropriate.  The caller has to allocate memory for the array
     and ensure it is big enough to hold len / PAGE_SIZE pointers.  uvm_loan()
     returns 0 for success, or an appropriate error number otherwise.  Note
     that wired pages cannot be loaned out; uvm_loan() will fail in that case.
     uvm_unloan() kills loans on pages or anons.  v must point to the array of
     pointers initialized by a previous call to uvm_loan().  npages should
     match the number of pages allocated for the loan, which is also the
     number of items in the array.  Argument flags should be one of
     #define UVM_LOAN_TOANON       0x01    /* loan to anons */
     #define UVM_LOAN_TOPAGE       0x02    /* loan to kernel */
     and should match what was used for the previous call to uvm_loan().
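     The sizing rule for the v array can be illustrated in user space.  This
     is a sketch under stated assumptions: MODEL_PAGE_SIZE stands in for
     PAGE_SIZE with an assumed value of 4096, and loan_npages() and
     loan_array_alloc() are hypothetical helpers, not part of UVM.

```c
#include <stddef.h>
#include <stdlib.h>

#define MODEL_PAGE_SIZE	4096UL	/* assumed page size for illustration */

/*
 * Number of pointer slots needed to loan len bytes starting at start.
 * Returns 0 if the request is empty or if start or len is not a
 * multiple of the page size.
 */
static size_t
loan_npages(unsigned long start, unsigned long len)
{
	if (len == 0 || start % MODEL_PAGE_SIZE != 0 ||
	    len % MODEL_PAGE_SIZE != 0)
		return 0;
	return (size_t)(len / MODEL_PAGE_SIZE);
}

/* Allocate a zeroed array with one pointer slot per loaned page. */
static void **
loan_array_alloc(size_t npages)
{
	return calloc(npages, sizeof(void *));
}
```

     The count computed here is the same value that would later be passed as
     npages when the loan is undone.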
MISCELLANEOUS FUNCTIONS
     struct uvm_object *
     uao_create(vsize_t size, int flags);
     void
     uao_detach(struct uvm_object *uobj);
     void
     uao_reference(struct uvm_object *uobj);
     void
     uvm_chgkprot(void *addr, size_t len, int rw);
     bool
     uvm_kernacc(void *addr, size_t len, int rw);
     int
     uvm_vslock(struct vmspace *vs, void *addr, size_t len, vm_prot_t prot);
     void
     uvm_vsunlock(struct vmspace *vs, void *addr, size_t len);
     void
     uvm_meter(void);
     void
     uvm_proc_fork(struct proc *p1, struct proc *p2, bool shared);
     int
     uvm_grow(struct proc *p, vaddr_t sp);
     void
     uvn_findpages(struct uvm_object *uobj, voff_t offset, int *npagesp,
     struct vm_page **pps, int flags);
     void
     uvm_vnp_setsize(struct vnode *vp, voff_t newsize);
     The uao_create(), uao_detach(), and uao_reference() functions operate on
     anonymous memory objects, such as those used to support System V shared
     memory.  uao_create() returns an object of size size with flags:
     #define UAO_FLAG_KERNOBJ        0x1     /* create kernel object */
     #define UAO_FLAG_KERNSWAP       0x2     /* enable kernel swap */
     which can only be used once each at system boot time.  uao_reference()
     creates an additional reference to the named anonymous memory object.
     uao_detach() removes a reference from the named anonymous memory object,
     destroying it if removing the last reference.
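     The reference-counting life cycle described above can be modelled with a
     toy object in user space.  Everything in this sketch is hypothetical
     (struct toy_obj and the toy_*() functions); in particular, it assumes
     that a freshly created object starts with a single reference held by its
     creator.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an anonymous memory object. */
struct toy_obj {
	int refs;	/* number of outstanding references */
	size_t size;	/* size of the object */
};

/* Create an object holding one reference; returns NULL on failure. */
static struct toy_obj *
toy_create(size_t size)
{
	struct toy_obj *o = calloc(1, sizeof(*o));

	if (o != NULL) {
		o->refs = 1;
		o->size = size;
	}
	return o;
}

/* Take an additional reference. */
static void
toy_reference(struct toy_obj *o)
{
	o->refs++;
}

/*
 * Drop one reference.  Returns 1 if this was the last reference and
 * the object was destroyed, 0 otherwise.
 */
static int
toy_detach(struct toy_obj *o)
{
	assert(o->refs > 0);
	if (--o->refs > 0)
		return 0;
	free(o);
	return 1;
}
```

     Each reference taken must be balanced by exactly one detach; the object
     is released only by the detach that drops the count to zero.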
     uvm_chgkprot() changes the protection of kernel memory from addr to addr
     + len to the value of rw.  This is primarily useful for debuggers, for
     setting breakpoints.  This function is only available with options KGDB.
     uvm_kernacc() checks whether the range from addr to addr + len permits
     rw access in the kernel address space.
     uvm_vslock() and uvm_vsunlock() control the wiring and unwiring of pages
     in the address space vs from addr to addr + len.  These functions are
     normally used to wire memory for I/O.
     uvm_meter() calculates the load average.
     uvm_proc_fork() forks a virtual address space for processes p1 (old) and
     p2 (new).  If the shared argument is non-zero, p1 shares its address
     space with p2; otherwise a new address space is created.  This function
     currently has no return value, and thus cannot fail.  In the future, this
     function will be changed to allow it to fail in low memory conditions.
     uvm_grow() increases the stack segment of process p to include sp.
     uvn_findpages() looks up or creates pages in uobj at offset offset, marks
     them busy and returns them in the pps array.  Currently uobj must be a
     vnode object.  The number of pages requested is pointed to by npagesp,
     and this value is updated with the actual number of pages returned.  The
     flags can be any bitwise inclusive-or of:
         UFP_ALL             Zero pseudo-flag meaning return all pages.
         UFP_NOWAIT          Don't sleep -- yield NULL for busy pages or for
                             uncached pages for which allocation would sleep.
         UFP_NOALLOC         Don't allocate -- yield NULL for uncached pages.
         UFP_NOCACHE         Don't use cached pages -- yield NULL instead.
         UFP_NORDONLY        Don't yield read-only pages -- yield NULL for
                             pages marked PG_READONLY.
         UFP_DIRTYONLY       Don't yield clean pages -- stop early at the
                             first clean one.  As a side effect, mark yielded
                             dirty pages clean.  Caller must write them to
                             permanent storage before unbusying.
         UFP_BACKWARD        Traverse pages in reverse order.  If
                             uvn_findpages() returns early, it will have
                             filled *npagesp entries at the end of pps rather
                             than the beginning.
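     Where a caller finds the returned entries after an early return depends
     on the traversal direction, as the UFP_BACKWARD description states.  The
     helper below is a hypothetical user-space sketch of that indexing rule
     only; it does not call uvn_findpages(), and the backward argument stands
     in for whether UFP_BACKWARD was passed.

```c
#include <stddef.h>

/*
 * Given the number of entries requested, the number actually returned
 * (the updated *npagesp), and the traversal direction, compute the
 * index in pps of the first entry that was filled in.
 */
static size_t
first_filled_index(size_t requested, size_t returned, int backward)
{
	if (returned > requested)
		returned = requested;	/* defensive clamp */
	/* Backward traversal fills the tail of the array. */
	return backward ? requested - returned : 0;
}
```

     For a request of 8 pages of which 3 were returned, the filled entries
     start at index 0 for a forward traversal and at index 5 for a backward
     one.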
     uvm_vnp_setsize() sets the size of vnode vp to newsize.  Caller must hold
     a reference to the vnode.  If the vnode shrinks, pages no longer used are
     discarded.
MISCELLANEOUS MACROS
     paddr_t
     atop(paddr_t pa);
     paddr_t
     ptoa(paddr_t pn);
     paddr_t
     round_page(address);
     paddr_t
     trunc_page(address);
     The atop() macro converts a physical address pa into a page number.  The
     ptoa() macro does the opposite by converting a page number pn into a
     physical address.
     The round_page() and trunc_page() macros return a page address boundary
     from rounding address up and down, respectively, to the nearest page
     boundary.  These macros work for either addresses or byte counts.
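     The arithmetic behind these macros can be reproduced in user space.  The
     definitions below are illustrative models, not the kernel's own: the
     model_ prefix marks them as hypothetical, and a 4096-byte page
     (MODEL_PAGE_SHIFT of 12) is assumed.

```c
#include <stdint.h>

#define MODEL_PAGE_SHIFT	12	/* assumed: 4096-byte pages */
#define MODEL_PAGE_SIZE		((uintptr_t)1 << MODEL_PAGE_SHIFT)
#define MODEL_PAGE_MASK		(MODEL_PAGE_SIZE - 1)

/* Address to page number, and page number back to address. */
#define model_atop(pa)		((uintptr_t)(pa) >> MODEL_PAGE_SHIFT)
#define model_ptoa(pn)		((uintptr_t)(pn) << MODEL_PAGE_SHIFT)

/* Round an address or byte count up or down to a page boundary. */
#define model_round_page(x) \
	(((uintptr_t)(x) + MODEL_PAGE_MASK) & ~MODEL_PAGE_MASK)
#define model_trunc_page(x) \
	((uintptr_t)(x) & ~MODEL_PAGE_MASK)
```

     Under these assumptions, model_round_page(0x1001) is 0x2000 and
     model_trunc_page(0x1fff) is 0x1000; a value already on a page boundary
     is returned unchanged by both.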
SYSCTL
     UVM provides support for the CTL_VM domain of the sysctl(3) hierarchy.
     It handles the VM_LOADAVG, VM_METER, VM_UVMEXP, and VM_UVMEXP2 nodes,
     which return the current load averages, the current VM totals, the
     uvmexp structure, and a kernel-version-independent view of the uvmexp
     structure, respectively.  It also exports a number of tunables
     that control how much VM space is allowed to be consumed by various
     tasks.  The load averages are typically accessed from userland using the
     getloadavg(3) function.  The uvmexp structure has all global state of the
     UVM system, and has the following members:
     /* vm_page constants */
     int pagesize;   /* size of a page (PAGE_SIZE): must be power of 2 */
     int pagemask;   /* page mask */
     int pageshift;  /* page shift */
     /* vm_page counters */
     int npages;     /* number of pages we manage */
     int free;       /* number of free pages */
     int paging;     /* number of pages in the process of being paged out */
     int wired;      /* number of wired pages */
     int reserve_pagedaemon; /* number of pages reserved for pagedaemon */
     int reserve_kernel; /* number of pages reserved for kernel */
     /* pageout params */
     int freemin;    /* min number of free pages */
     int freetarg;   /* target number of free pages */
     int inactarg;   /* target number of inactive pages */
     int wiredmax;   /* max number of wired pages */
     /* swap */
     int nswapdev;   /* number of configured swap devices in system */
     int swpages;    /* number of PAGE_SIZE'ed swap pages */
     int swpginuse;  /* number of swap pages in use */
     int nswget;     /* number of times fault calls uvm_swap_get() */
     int nanon;      /* total number of anons in system */
     int nfreeanon;  /* number of free anons */
     /* stat counters */
     int faults;             /* page fault count */
     int traps;              /* trap count */
     int intrs;              /* interrupt count */
     int swtch;              /* context switch count */
     int softs;              /* software interrupt count */
     int syscalls;           /* system calls */
     int pageins;            /* pagein operation count */
                             /* pageouts are in pdpageouts below */
     int pgswapin;           /* pages swapped in */
     int pgswapout;          /* pages swapped out */
     int forks;              /* forks */
     int forks_ppwait;       /* forks where parent waits */
     int forks_sharevm;      /* forks where vmspace is shared */
     /* fault subcounters */
     int fltnoram;   /* number of times fault was out of ram */
     int fltnoanon;  /* number of times fault was out of anons */
     int fltpgwait;  /* number of times fault had to wait on a page */
     int fltpgrele;  /* number of times fault found a released page */
     int fltrelck;   /* number of times fault relock called */
     int fltrelckok; /* number of times fault relock is a success */
     int fltanget;   /* number of times fault gets anon page */
     int fltanretry; /* number of times fault retries an anon get */
     int fltamcopy;  /* number of times fault clears "needs copy" */
     int fltnamap;   /* number of times fault maps a neighbor anon page */
     int fltnomap;   /* number of times fault maps a neighbor obj page */
     int fltlget;    /* number of times fault does a locked pgo_get */
     int fltget;     /* number of times fault does an unlocked get */
     int flt_anon;   /* number of times fault anon (case 1a) */
     int flt_acow;   /* number of times fault anon cow (case 1b) */
     int flt_obj;    /* number of times fault is on object page (2a) */
     int flt_prcopy; /* number of times fault promotes with copy (2b) */
     int flt_przero; /* number of times fault promotes with zerofill (2b) */
     /* daemon counters */
     int pdwoke;     /* number of times daemon woke up */
     int pdrevs;     /* number of times daemon rev'd clock hand */
     int pdfreed;    /* number of pages daemon freed since boot */
     int pdscans;    /* number of pages daemon scanned since boot */
     int pdanscan;   /* number of anonymous pages scanned by daemon */
     int pdobscan;   /* number of object pages scanned by daemon */
     int pdreact;    /* number of pages daemon reactivated since boot */
     int pdbusy;     /* number of times daemon found a busy page */
     int pdpageouts; /* number of times daemon started a pageout */
     int pdpending;  /* number of times daemon got a pending pageout */
     int pddeact;    /* number of pages daemon deactivates */
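     As noted above, userland normally reads the load averages through
     getloadavg(3) rather than by querying the sysctl node directly.  A
     minimal sketch of that access path follows; print_load_averages() is a
     hypothetical helper name, and the _DEFAULT_SOURCE definition is an
     assumption made so that the declaration is visible on systems that hide
     it behind a feature-test macro.

```c
#define _DEFAULT_SOURCE		/* assumed: exposes getloadavg() where needed */
#include <stdio.h>
#include <stdlib.h>

/*
 * Print the 1, 5 and 15 minute load averages.  Returns the number of
 * samples retrieved, or -1 if the load average is unobtainable.
 */
static int
print_load_averages(void)
{
	double avg[3];
	int i, n;

	n = getloadavg(avg, 3);
	if (n < 0) {
		fprintf(stderr, "load average unavailable\n");
		return -1;
	}
	for (i = 0; i < n; i++)
		printf("load average[%d] = %.2f\n", i, avg[i]);
	return n;
}
```

     getloadavg(3) may return fewer samples than requested, so the loop
     prints only the entries that were actually filled in.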
NOTES
     uvm_chgkprot() is only available if the kernel has been compiled with
     options KGDB.
     All structures and types whose names begin with "vm_" will be renamed to
     "uvm_".
SEE ALSO
     swapctl(2), getloadavg(3), kvm(3), sysctl(3), ddb(4), options(4),
     memoryallocators(9), pmap(9), ubc(9), uvm_km(9), uvm_map(9)
     Charles D. Cranor and Gurudatta M. Parulkar, "The UVM Virtual Memory
     System", Proceedings of the USENIX Annual Technical Conference, USENIX
     Association,
     http://www.usenix.org/event/usenix99/full_papers/cranor/cranor.pdf,
     117-130, June 6-11, 1999.
HISTORY
     UVM is a new VM system developed at Washington University in St. Louis
     (Missouri).  UVM's roots lie partly in the Mach-based 4.4BSD VM system,
     the FreeBSD VM system, and the SunOS 4 VM system.  UVM's basic structure
     is based on the 4.4BSD VM system.  UVM's new anonymous memory system is
     based on the anonymous memory system found in the SunOS 4 VM (as
     described in papers published by Sun Microsystems, Inc.).  UVM also
     includes a number of features new to BSD including page loanout, map
     entry passing, simplified copy-on-write, and clustered anonymous memory
     pageout.  UVM is also further documented in an August 1998 dissertation
     by Charles D. Cranor.
     UVM appeared in NetBSD 1.4.
AUTHORS
     Charles D. Cranor <chuck@ccrc.wustl.edu> designed and implemented UVM.
     Matthew Green <mrg@eterna.com.au> wrote the swap-space management code
     and handled the logistical issues involved with merging UVM into the
     NetBSD source tree.
     Chuck Silvers <chuq@chuq.com> implemented the aobj pager, thus allowing
     UVM to support System V shared memory and process swapping.
NetBSD 10.99                    March 23, 2015                    NetBSD 10.99