[U-Boot] [PATCH 0/2] zfs: Add filesystem ZFS support

This series adds ZFS filesystem support, ported from GRUB. It adds the 'zfsload' and 'zfsls' commands. Files are addressed with the ZFS pool notation '/POOLNAME/@/directory/directory/file', which is also explained in the commands' help output.
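To make the pool notation concrete, here is a minimal sketch of how a path such as '/POOLNAME/@/directory/file' could be split at the '@'. This is not the code in fs/zfs/zfs.c; the helper name split_zfs_path, the struct zfs_path_parts and the pool name "rpool" are made up for illustration. Treating the text between '@' and the next '/' as a snapshot name is an assumption borrowed from the GRUB convention.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical container for the pieces of a ZFS path (illustration only). */
struct zfs_path_parts {
	char dataset[64];	/* text before '@', e.g. "/rpool/" */
	char snapshot[64];	/* text between '@' and the next '/', may be empty
				 * (assumed to be a snapshot name) */
	const char *file;	/* rest of the input, e.g. "/boot/uImage" */
};

/*
 * Split "/POOLNAME/@/directory/file" at the first '@'.
 * Returns 0 on success, -1 if there is no '@' or a component is too long.
 */
int split_zfs_path(const char *path, struct zfs_path_parts *out)
{
	const char *at = strchr(path, '@');
	const char *rest;
	size_t len;

	if (at == NULL)
		return -1;

	/* Everything before the '@' names the pool/dataset. */
	len = (size_t)(at - path);
	if (len >= sizeof(out->dataset))
		return -1;
	memcpy(out->dataset, path, len);
	out->dataset[len] = '\0';

	/* The file path starts at the first '/' after the '@'. */
	rest = strchr(at + 1, '/');
	if (rest == NULL)
		rest = at + 1 + strlen(at + 1);	/* no file part: point at the NUL */

	len = (size_t)(rest - (at + 1));
	if (len >= sizeof(out->snapshot))
		return -1;
	memcpy(out->snapshot, at + 1, len);
	out->snapshot[len] = '\0';

	out->file = rest;
	return 0;
}
```

With this sketch, "/rpool/@/boot/uImage" yields the dataset "/rpool/", an empty snapshot field, and the file path "/boot/uImage". A path without '@' is rejected.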
The initial revision contributed to GRUB can be found at: http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/grub/grub-0.97...
Using "either version 2 of the License, or (at your option) any later version."
Jorgen Lundman (2):
  ZFS header files
  zfs: Add ZFS filesystem support
 Makefile                     |    2 +-
 common/Makefile              |    1 +
 common/cmd_zfs.c             |  244 +++++
 fs/Makefile                  |    1 +
 fs/{ => zfs}/Makefile        |   43 +-
 fs/zfs/dev.c                 |  139 +++
 fs/zfs/zfs.c                 | 2414 ++++++++++++++++++++++++++++++++++++++++++
 fs/zfs/zfs_fletcher.c        |   84 ++
 fs/zfs/zfs_lzjb.c            |   94 ++
 fs/zfs/zfs_sha256.c          |  145 +++
 include/config_cmd_all.h     |    1 +
 include/zfs/dmu.h            |  119 +++
 include/zfs/dmu_objset.h     |   43 +
 include/zfs/dnode.h          |   80 ++
 include/zfs/dsl_dataset.h    |   52 +
 include/zfs/dsl_dir.h        |   48 +
 include/zfs/sa_impl.h        |   34 +
 include/zfs/spa.h            |  311 ++++++
 include/zfs/uberblock_impl.h |   57 +
 include/zfs/vdev_impl.h      |   69 ++
 include/zfs/zap_impl.h       |  112 ++
 include/zfs/zap_leaf.h       |  103 ++
 include/zfs/zfs.h            |  122 +++
 include/zfs/zfs_acl.h        |   55 +
 include/zfs/zfs_znode.h      |   70 ++
 include/zfs/zil.h            |   56 +
 include/zfs/zio.h            |   92 ++
 include/zfs/zio_checksum.h   |   49 +
 include/zfs_common.h         |   94 ++
 29 files changed, 4718 insertions(+), 16 deletions(-)
 create mode 100644 common/cmd_zfs.c
 copy fs/{ => zfs}/Makefile (52%)
 create mode 100644 fs/zfs/dev.c
 create mode 100644 fs/zfs/zfs.c
 create mode 100644 fs/zfs/zfs_fletcher.c
 create mode 100644 fs/zfs/zfs_lzjb.c
 create mode 100644 fs/zfs/zfs_sha256.c
 create mode 100644 include/zfs/dmu.h
 create mode 100644 include/zfs/dmu_objset.h
 create mode 100644 include/zfs/dnode.h
 create mode 100644 include/zfs/dsl_dataset.h
 create mode 100644 include/zfs/dsl_dir.h
 create mode 100644 include/zfs/sa_impl.h
 create mode 100644 include/zfs/spa.h
 create mode 100644 include/zfs/uberblock_impl.h
 create mode 100644 include/zfs/vdev_impl.h
 create mode 100644 include/zfs/zap_impl.h
 create mode 100644 include/zfs/zap_leaf.h
 create mode 100644 include/zfs/zfs.h
 create mode 100644 include/zfs/zfs_acl.h
 create mode 100644 include/zfs/zfs_znode.h
 create mode 100644 include/zfs/zil.h
 create mode 100644 include/zfs/zio.h
 create mode 100644 include/zfs/zio_checksum.h
 create mode 100644 include/zfs_common.h

---
 include/zfs/dmu.h            | 119 ++++++++++++++++
 include/zfs/dmu_objset.h     |  43 ++++++
 include/zfs/dnode.h          |  80 +++++++++++
 include/zfs/dsl_dataset.h    |  52 +++++++
 include/zfs/dsl_dir.h        |  48 +++++++
 include/zfs/sa_impl.h        |  34 +++++
 include/zfs/spa.h            | 311 ++++++++++++++++++++++++++++++++++++++++++
 include/zfs/uberblock_impl.h |  57 ++++++++
 include/zfs/vdev_impl.h      |  69 +++++++++
 include/zfs/zap_impl.h       | 112 +++++++++++++++
 include/zfs/zap_leaf.h       | 103 ++++++++++++++
 include/zfs/zfs.h            | 122 +++++++++++++++++
 include/zfs/zfs_acl.h        |  55 ++++++++
 include/zfs/zfs_znode.h      |  70 ++++++++++
 include/zfs/zil.h            |  56 ++++++++
 include/zfs/zio.h            |  92 +++++++++++++
 include/zfs/zio_checksum.h   |  49 +++++++
 17 files changed, 1472 insertions(+), 0 deletions(-)
 create mode 100644 include/zfs/dmu.h
 create mode 100644 include/zfs/dmu_objset.h
 create mode 100644 include/zfs/dnode.h
 create mode 100644 include/zfs/dsl_dataset.h
 create mode 100644 include/zfs/dsl_dir.h
 create mode 100644 include/zfs/sa_impl.h
 create mode 100644 include/zfs/spa.h
 create mode 100644 include/zfs/uberblock_impl.h
 create mode 100644 include/zfs/vdev_impl.h
 create mode 100644 include/zfs/zap_impl.h
 create mode 100644 include/zfs/zap_leaf.h
 create mode 100644 include/zfs/zfs.h
 create mode 100644 include/zfs/zfs_acl.h
 create mode 100644 include/zfs/zfs_znode.h
 create mode 100644 include/zfs/zil.h
 create mode 100644 include/zfs/zio.h
 create mode 100644 include/zfs/zio_checksum.h
diff --git a/include/zfs/dmu.h b/include/zfs/dmu.h new file mode 100644 index 0000000..bee317e --- /dev/null +++ b/include/zfs/dmu.h @@ -0,0 +1,119 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2010 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. + */ + +#ifndef _SYS_DMU_H +#define _SYS_DMU_H + +/* + * This file describes the interface that the DMU provides for its + * consumers. + * + * The DMU also interacts with the SPA. That interface is described in + * dmu_spa.h. 
+ */ +typedef enum dmu_object_type { + DMU_OT_NONE, + /* general: */ + DMU_OT_OBJECT_DIRECTORY, /* ZAP */ + DMU_OT_OBJECT_ARRAY, /* UINT64 */ + DMU_OT_PACKED_NVLIST, /* UINT8 (XDR by nvlist_pack/unpack) */ + DMU_OT_PACKED_NVLIST_SIZE, /* UINT64 */ + DMU_OT_BPLIST, /* UINT64 */ + DMU_OT_BPLIST_HDR, /* UINT64 */ + /* spa: */ + DMU_OT_SPACE_MAP_HEADER, /* UINT64 */ + DMU_OT_SPACE_MAP, /* UINT64 */ + /* zil: */ + DMU_OT_INTENT_LOG, /* UINT64 */ + /* dmu: */ + DMU_OT_DNODE, /* DNODE */ + DMU_OT_OBJSET, /* OBJSET */ + /* dsl: */ + DMU_OT_DSL_DIR, /* UINT64 */ + DMU_OT_DSL_DIR_CHILD_MAP, /* ZAP */ + DMU_OT_DSL_DS_SNAP_MAP, /* ZAP */ + DMU_OT_DSL_PROPS, /* ZAP */ + DMU_OT_DSL_DATASET, /* UINT64 */ + /* zpl: */ + DMU_OT_ZNODE, /* ZNODE */ + DMU_OT_OLDACL, /* OLD ACL */ + DMU_OT_PLAIN_FILE_CONTENTS, /* UINT8 */ + DMU_OT_DIRECTORY_CONTENTS, /* ZAP */ + DMU_OT_MASTER_NODE, /* ZAP */ + DMU_OT_UNLINKED_SET, /* ZAP */ + /* zvol: */ + DMU_OT_ZVOL, /* UINT8 */ + DMU_OT_ZVOL_PROP, /* ZAP */ + /* other; for testing only! 
*/ + DMU_OT_PLAIN_OTHER, /* UINT8 */ + DMU_OT_UINT64_OTHER, /* UINT64 */ + DMU_OT_ZAP_OTHER, /* ZAP */ + /* new object types: */ + DMU_OT_ERROR_LOG, /* ZAP */ + DMU_OT_SPA_HISTORY, /* UINT8 */ + DMU_OT_SPA_HISTORY_OFFSETS, /* spa_his_phys_t */ + DMU_OT_POOL_PROPS, /* ZAP */ + DMU_OT_DSL_PERMS, /* ZAP */ + DMU_OT_ACL, /* ACL */ + DMU_OT_SYSACL, /* SYSACL */ + DMU_OT_FUID, /* FUID table (Packed NVLIST UINT8) */ + DMU_OT_FUID_SIZE, /* FUID table size UINT64 */ + DMU_OT_NEXT_CLONES, /* ZAP */ + DMU_OT_SCRUB_QUEUE, /* ZAP */ + DMU_OT_USERGROUP_USED, /* ZAP */ + DMU_OT_USERGROUP_QUOTA, /* ZAP */ + DMU_OT_USERREFS, /* ZAP */ + DMU_OT_DDT_ZAP, /* ZAP */ + DMU_OT_DDT_STATS, /* ZAP */ + DMU_OT_SA, /* System attr */ + DMU_OT_SA_MASTER_NODE, /* ZAP */ + DMU_OT_SA_ATTR_REGISTRATION, /* ZAP */ + DMU_OT_SA_ATTR_LAYOUTS, /* ZAP */ + DMU_OT_NUMTYPES +} dmu_object_type_t; + +typedef enum dmu_objset_type { + DMU_OST_NONE, + DMU_OST_META, + DMU_OST_ZFS, + DMU_OST_ZVOL, + DMU_OST_OTHER, /* For testing only! */ + DMU_OST_ANY, /* Be careful! */ + DMU_OST_NUMTYPES +} dmu_objset_type_t; + +/* + * The names of zap entries in the DIRECTORY_OBJECT of the MOS. + */ +#define DMU_POOL_DIRECTORY_OBJECT 1 +#define DMU_POOL_CONFIG "config" +#define DMU_POOL_ROOT_DATASET "root_dataset" +#define DMU_POOL_SYNC_BPLIST "sync_bplist" +#define DMU_POOL_ERRLOG_SCRUB "errlog_scrub" +#define DMU_POOL_ERRLOG_LAST "errlog_last" +#define DMU_POOL_SPARES "spares" +#define DMU_POOL_DEFLATE "deflate" +#define DMU_POOL_HISTORY "history" +#define DMU_POOL_PROPS "pool_props" +#define DMU_POOL_L2CACHE "l2cache" + +#endif /* _SYS_DMU_H */ diff --git a/include/zfs/dmu_objset.h b/include/zfs/dmu_objset.h new file mode 100644 index 0000000..176cad7 --- /dev/null +++ b/include/zfs/dmu_objset.h @@ -0,0 +1,43 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. 
+ * Copyright (C) 2010 Robert Millan rmh@gnu.org + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2009 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. + */ + +#ifndef _SYS_DMU_OBJSET_H +#define _SYS_DMU_OBJSET_H + +#include <zfs/zil.h> + +#define OBJSET_PHYS_SIZE 2048 +#define OBJSET_PHYS_SIZE_V14 1024 + +typedef struct objset_phys { + dnode_phys_t os_meta_dnode; + zil_header_t os_zil_header; + uint64_t os_type; + uint64_t os_flags; + char os_pad[OBJSET_PHYS_SIZE - sizeof(dnode_phys_t)*3 - + sizeof(zil_header_t) - sizeof(uint64_t)*2]; + dnode_phys_t os_userused_dnode; + dnode_phys_t os_groupused_dnode; +} objset_phys_t; + +#endif /* _SYS_DMU_OBJSET_H */ diff --git a/include/zfs/dnode.h b/include/zfs/dnode.h new file mode 100644 index 0000000..9ec3d43 --- /dev/null +++ b/include/zfs/dnode.h @@ -0,0 +1,80 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. 
+ * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2010 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. + */ + +#ifndef _SYS_DNODE_H +#define _SYS_DNODE_H + +#include <zfs/spa.h> + +/* + * Fixed constants. + */ +#define DNODE_SHIFT 9 /* 512 bytes */ +#define DN_MIN_INDBLKSHIFT 10 /* 1k */ +#define DN_MAX_INDBLKSHIFT 14 /* 16k */ +#define DNODE_BLOCK_SHIFT 14 /* 16k */ +#define DNODE_CORE_SIZE 64 /* 64 bytes for dnode sans blkptrs */ +#define DN_MAX_OBJECT_SHIFT 48 /* 256 trillion (zfs_fid_t limit) */ +#define DN_MAX_OFFSET_SHIFT 64 /* 2^64 bytes in a dnode */ + +/* + * Derived constants. + */ +#define DNODE_SIZE (1 << DNODE_SHIFT) +#define DN_MAX_NBLKPTR ((DNODE_SIZE - DNODE_CORE_SIZE) >> SPA_BLKPTRSHIFT) +#define DN_MAX_BONUSLEN (DNODE_SIZE - DNODE_CORE_SIZE - (1 << SPA_BLKPTRSHIFT)) +#define DN_MAX_OBJECT (1ULL << DN_MAX_OBJECT_SHIFT) + +#define DNODES_PER_BLOCK_SHIFT (DNODE_BLOCK_SHIFT - DNODE_SHIFT) +#define DNODES_PER_BLOCK (1ULL << DNODES_PER_BLOCK_SHIFT) +#define DNODES_PER_LEVEL_SHIFT (DN_MAX_INDBLKSHIFT - SPA_BLKPTRSHIFT) + +#define DNODE_FLAG_SPILL_BLKPTR (1<<2) + +#define DN_BONUS(dnp) ((void *)((dnp)->dn_bonus + \ + (((dnp)->dn_nblkptr - 1) * sizeof(blkptr_t)))) + +typedef struct dnode_phys { + uint8_t dn_type; /* dmu_object_type_t */ + uint8_t dn_indblkshift; /* ln2(indirect block size) */ + uint8_t dn_nlevels; /* 1=dn_blkptr->data blocks */ + uint8_t dn_nblkptr; /* length of dn_blkptr */ + uint8_t dn_bonustype; /* type of data in bonus buffer */ + uint8_t dn_checksum; /* ZIO_CHECKSUM type */ + uint8_t dn_compress; /* ZIO_COMPRESS type */ + uint8_t dn_flags; /* 
DNODE_FLAG_* */ + uint16_t dn_datablkszsec; /* data block size in 512b sectors */ + uint16_t dn_bonuslen; /* length of dn_bonus */ + uint8_t dn_pad2[4]; + + /* accounting is protected by dn_dirty_mtx */ + uint64_t dn_maxblkid; /* largest allocated block ID */ + uint64_t dn_used; /* bytes (or sectors) of disk space */ + + uint64_t dn_pad3[4]; + + blkptr_t dn_blkptr[1]; + uint8_t dn_bonus[DN_MAX_BONUSLEN - sizeof(blkptr_t)]; + blkptr_t dn_spill; +} dnode_phys_t; + +#endif /* _SYS_DNODE_H */ diff --git a/include/zfs/dsl_dataset.h b/include/zfs/dsl_dataset.h new file mode 100644 index 0000000..c6de7ab --- /dev/null +++ b/include/zfs/dsl_dataset.h @@ -0,0 +1,52 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2007 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. 
+ */ + +#ifndef _SYS_DSL_DATASET_H +#define _SYS_DSL_DATASET_H + +typedef struct dsl_dataset_phys { + uint64_t ds_dir_obj; + uint64_t ds_prev_snap_obj; + uint64_t ds_prev_snap_txg; + uint64_t ds_next_snap_obj; + uint64_t ds_snapnames_zapobj; /* zap obj of snaps; ==0 for snaps */ + uint64_t ds_num_children; /* clone/snap children; ==0 for head */ + uint64_t ds_creation_time; /* seconds since 1970 */ + uint64_t ds_creation_txg; + uint64_t ds_deadlist_obj; + uint64_t ds_used_bytes; + uint64_t ds_compressed_bytes; + uint64_t ds_uncompressed_bytes; + uint64_t ds_unique_bytes; /* only relevant to snapshots */ + /* + * The ds_fsid_guid is a 56-bit ID that can change to avoid + * collisions. The ds_guid is a 64-bit ID that will never + * change, so there is a small probability that it will collide. + */ + uint64_t ds_fsid_guid; + uint64_t ds_guid; + uint64_t ds_flags; + blkptr_t ds_bp; + uint64_t ds_pad[8]; /* pad out to 320 bytes for good measure */ +} dsl_dataset_phys_t; + +#endif /* _SYS_DSL_DATASET_H */ diff --git a/include/zfs/dsl_dir.h b/include/zfs/dsl_dir.h new file mode 100644 index 0000000..c04e0b6 --- /dev/null +++ b/include/zfs/dsl_dir.h @@ -0,0 +1,48 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2007 Sun Microsystems, Inc. 
All rights reserved. + * Use is subject to license terms. + */ + +#ifndef _SYS_DSL_DIR_H +#define _SYS_DSL_DIR_H + +typedef struct dsl_dir_phys { + uint64_t dd_creation_time; /* not actually used */ + uint64_t dd_head_dataset_obj; + uint64_t dd_parent_obj; + uint64_t dd_clone_parent_obj; + uint64_t dd_child_dir_zapobj; + /* + * how much space our children are accounting for; for leaf + * datasets, == physical space used by fs + snaps + */ + uint64_t dd_used_bytes; + uint64_t dd_compressed_bytes; + uint64_t dd_uncompressed_bytes; + /* Administrative quota setting */ + uint64_t dd_quota; + /* Administrative reservation setting */ + uint64_t dd_reserved; + uint64_t dd_props_zapobj; + uint64_t dd_deleg_zapobj; /* dataset permissions */ + uint64_t dd_pad[20]; /* pad out to 256 bytes for good measure */ +} dsl_dir_phys_t; + +#endif /* _SYS_DSL_DIR_H */ diff --git a/include/zfs/sa_impl.h b/include/zfs/sa_impl.h new file mode 100644 index 0000000..4ec49fe --- /dev/null +++ b/include/zfs/sa_impl.h @@ -0,0 +1,34 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2010 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. 
+ */ +#ifndef _SYS_SA_IMPL_H +#define _SYS_SA_IMPL_H + +typedef struct sa_hdr_phys { + uint32_t sa_magic; + uint16_t sa_layout_info; + uint16_t sa_lengths[1]; +} sa_hdr_phys_t; + +#define SA_HDR_SIZE(hdr) BF32_GET_SB(hdr->sa_layout_info, 10, 16, 3, 0) +#define SA_SIZE_OFFSET 0x8 + +#endif /* _SYS_SA_IMPL_H */ diff --git a/include/zfs/spa.h b/include/zfs/spa.h new file mode 100644 index 0000000..100e2a6 --- /dev/null +++ b/include/zfs/spa.h @@ -0,0 +1,311 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004,2009 + * Free Software Foundation, Inc. + * Copyright 2010 Sun Microsystems, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ + +#ifndef GRUB_ZFS_SPA_HEADER +#define GRUB_ZFS_SPA_HEADER 1 + +typedef enum grub_zfs_endian { + UNKNOWN_ENDIAN = -2, + LITTLE_ENDIAN = -1, + BIG_ENDIAN = 0 +} grub_zfs_endian_t; + + +#define grub_zfs_to_cpu16(x, a) (((a) == BIG_ENDIAN) ? grub_be_to_cpu16(x) \ + : grub_le_to_cpu16(x)) +#define grub_cpu_to_zfs16(x, a) (((a) == BIG_ENDIAN) ? grub_cpu_to_be16(x) \ + : grub_cpu_to_le16(x)) + +#define grub_zfs_to_cpu32(x, a) (((a) == BIG_ENDIAN) ? grub_be_to_cpu32(x) \ + : grub_le_to_cpu32(x)) +#define grub_cpu_to_zfs32(x, a) (((a) == BIG_ENDIAN) ? grub_cpu_to_be32(x) \ + : grub_cpu_to_le32(x)) + +#define grub_zfs_to_cpu64(x, a) (((a) == BIG_ENDIAN) ? 
grub_be_to_cpu64(x) \ + : grub_le_to_cpu64(x)) +#define grub_cpu_to_zfs64(x, a) (((a) == BIG_ENDIAN) ? grub_cpu_to_be64(x) \ + : grub_cpu_to_le64(x)) + +/* + * General-purpose 32-bit and 64-bit bitfield encodings. + */ +#define BF32_DECODE(x, low, len) P2PHASE((x) >> (low), 1U << (len)) +#define BF64_DECODE(x, low, len) P2PHASE((x) >> (low), 1ULL << (len)) +#define BF32_ENCODE(x, low, len) (P2PHASE((x), 1U << (len)) << (low)) +#define BF64_ENCODE(x, low, len) (P2PHASE((x), 1ULL << (len)) << (low)) + +#define BF32_GET(x, low, len) BF32_DECODE(x, low, len) +#define BF64_GET(x, low, len) BF64_DECODE(x, low, len) + +#define BF32_SET(x, low, len, val) \ + ((x) ^= BF32_ENCODE((x >> low) ^ (val), low, len)) +#define BF64_SET(x, low, len, val) \ + ((x) ^= BF64_ENCODE((x >> low) ^ (val), low, len)) + +#define BF32_GET_SB(x, low, len, shift, bias) \ + ((BF32_GET(x, low, len) + (bias)) << (shift)) +#define BF64_GET_SB(x, low, len, shift, bias) \ + ((BF64_GET(x, low, len) + (bias)) << (shift)) + +#define BF32_SET_SB(x, low, len, shift, bias, val) \ + BF32_SET(x, low, len, ((val) >> (shift)) - (bias)) +#define BF64_SET_SB(x, low, len, shift, bias, val) \ + BF64_SET(x, low, len, ((val) >> (shift)) - (bias)) + +/* + * We currently support nine block sizes, from 512 bytes to 128K. + * We could go higher, but the benefits are near-zero and the cost + * of COWing a giant block to modify one byte would become excessive. + */ +#define SPA_MINBLOCKSHIFT 9 +#define SPA_MAXBLOCKSHIFT 17 +#define SPA_MINBLOCKSIZE (1ULL << SPA_MINBLOCKSHIFT) +#define SPA_MAXBLOCKSIZE (1ULL << SPA_MAXBLOCKSHIFT) + +#define SPA_BLOCKSIZES (SPA_MAXBLOCKSHIFT - SPA_MINBLOCKSHIFT + 1) + +/* + * Size of block to hold the configuration data (a packed nvlist) + */ +#define SPA_CONFIG_BLOCKSIZE (1 << 14) + +/* + * The DVA size encodings for LSIZE and PSIZE support blocks up to 32MB. 
+ * The ASIZE encoding should be at least 64 times larger (6 more bits) + * to support up to 4-way RAID-Z mirror mode with worst-case gang block + * overhead, three DVAs per bp, plus one more bit in case we do anything + * else that expands the ASIZE. + */ +#define SPA_LSIZEBITS 16 /* LSIZE up to 32M (2^16 * 512) */ +#define SPA_PSIZEBITS 16 /* PSIZE up to 32M (2^16 * 512) */ +#define SPA_ASIZEBITS 24 /* ASIZE up to 64 times larger */ + +/* + * All SPA data is represented by 128-bit data virtual addresses (DVAs). + * The members of the dva_t should be considered opaque outside the SPA. + */ +typedef struct dva { + uint64_t dva_word[2]; +} dva_t; + +/* + * Each block has a 256-bit checksum -- strong enough for cryptographic hashes. + */ +typedef struct zio_cksum { + uint64_t zc_word[4]; +} zio_cksum_t; + +/* + * Each block is described by its DVAs, time of birth, checksum, etc. + * The word-by-word, bit-by-bit layout of the blkptr is as follows: + * + * 64 56 48 40 32 24 16 8 0 + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * 0 | vdev1 | GRID | ASIZE | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * 1 |G| offset1 | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * 2 | vdev2 | GRID | ASIZE | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * 3 |G| offset2 | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * 4 | vdev3 | GRID | ASIZE | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * 5 |G| offset3 | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * 6 |BDX|lvl| type | cksum | comp | PSIZE | LSIZE | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * 7 | padding | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * 8 | padding | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * 9 | physical birth txg | + * 
+-------+-------+-------+-------+-------+-------+-------+-------+ + * a | logical birth txg | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * b | fill count | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * c | checksum[0] | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * d | checksum[1] | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * e | checksum[2] | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * f | checksum[3] | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * + * Legend: + * + * vdev virtual device ID + * offset offset into virtual device + * LSIZE logical size + * PSIZE physical size (after compression) + * ASIZE allocated size (including RAID-Z parity and gang block headers) + * GRID RAID-Z layout information (reserved for future use) + * cksum checksum function + * comp compression function + * G gang block indicator + * B byteorder (endianness) + * D dedup + * X unused + * lvl level of indirection + * type DMU object type + * phys birth txg of block allocation; zero if same as logical birth txg + * log. birth transaction group in which the block was logically born + * fill count number of non-zero blocks under this bp + * checksum[4] 256-bit checksum of the data this bp describes + */ +#define SPA_BLKPTRSHIFT 7 /* blkptr_t is 128 bytes */ +#define SPA_DVAS_PER_BP 3 /* Number of DVAs in a bp */ + +typedef struct blkptr { + dva_t blk_dva[SPA_DVAS_PER_BP]; /* Data Virtual Addresses */ + uint64_t blk_prop; /* size, compression, type, etc */ + uint64_t blk_pad[2]; /* Extra space for the future */ + uint64_t blk_phys_birth; /* txg when block was allocated */ + uint64_t blk_birth; /* transaction group at birth */ + uint64_t blk_fill; /* fill count */ + zio_cksum_t blk_cksum; /* 256-bit checksum */ +} blkptr_t; + +/* + * Macros to get and set fields in a bp or DVA. 
+ */ +#define DVA_GET_ASIZE(dva) \ + BF64_GET_SB((dva)->dva_word[0], 0, 24, SPA_MINBLOCKSHIFT, 0) +#define DVA_SET_ASIZE(dva, x) \ + BF64_SET_SB((dva)->dva_word[0], 0, 24, SPA_MINBLOCKSHIFT, 0, x) + +#define DVA_GET_GRID(dva) BF64_GET((dva)->dva_word[0], 24, 8) +#define DVA_SET_GRID(dva, x) BF64_SET((dva)->dva_word[0], 24, 8, x) + +#define DVA_GET_VDEV(dva) BF64_GET((dva)->dva_word[0], 32, 32) +#define DVA_SET_VDEV(dva, x) BF64_SET((dva)->dva_word[0], 32, 32, x) + +#define DVA_GET_GANG(dva) BF64_GET((dva)->dva_word[1], 63, 1) +#define DVA_SET_GANG(dva, x) BF64_SET((dva)->dva_word[1], 63, 1, x) + +#define BP_GET_LSIZE(bp) \ + BF64_GET_SB((bp)->blk_prop, 0, 16, SPA_MINBLOCKSHIFT, 1) +#define BP_SET_LSIZE(bp, x) \ + BF64_SET_SB((bp)->blk_prop, 0, 16, SPA_MINBLOCKSHIFT, 1, x) + +#define BP_GET_COMPRESS(bp) BF64_GET((bp)->blk_prop, 32, 8) +#define BP_SET_COMPRESS(bp, x) BF64_SET((bp)->blk_prop, 32, 8, x) + +#define BP_GET_CHECKSUM(bp) BF64_GET((bp)->blk_prop, 40, 8) +#define BP_SET_CHECKSUM(bp, x) BF64_SET((bp)->blk_prop, 40, 8, x) + +#define BP_GET_TYPE(bp) BF64_GET((bp)->blk_prop, 48, 8) +#define BP_SET_TYPE(bp, x) BF64_SET((bp)->blk_prop, 48, 8, x) + +#define BP_GET_LEVEL(bp) BF64_GET((bp)->blk_prop, 56, 5) +#define BP_SET_LEVEL(bp, x) BF64_SET((bp)->blk_prop, 56, 5, x) + +#define BP_GET_PROP_BIT_61(bp) BF64_GET((bp)->blk_prop, 61, 1) +#define BP_SET_PROP_BIT_61(bp, x) BF64_SET((bp)->blk_prop, 61, 1, x) + +#define BP_GET_DEDUP(bp) BF64_GET((bp)->blk_prop, 62, 1) +#define BP_SET_DEDUP(bp, x) BF64_SET((bp)->blk_prop, 62, 1, x) + +#define BP_GET_BYTEORDER(bp) (0 - BF64_GET((bp)->blk_prop, 63, 1)) +#define BP_SET_BYTEORDER(bp, x) BF64_SET((bp)->blk_prop, 63, 1, x) + +#define BP_PHYSICAL_BIRTH(bp) \ + ((bp)->blk_phys_birth ? (bp)->blk_phys_birth : (bp)->blk_birth) + +#define BP_SET_BIRTH(bp, logical, physical) \ + { \ + (bp)->blk_birth = (logical); \ + (bp)->blk_phys_birth = ((logical) == (physical) ? 
0 : (physical)); \ + } + +#define BP_GET_ASIZE(bp) \ + (DVA_GET_ASIZE(&(bp)->blk_dva[0]) + DVA_GET_ASIZE(&(bp)->blk_dva[1]) + \ + DVA_GET_ASIZE(&(bp)->blk_dva[2])) + +#define BP_GET_UCSIZE(bp) \ + ((BP_GET_LEVEL(bp) > 0 || dmu_ot[BP_GET_TYPE(bp)].ot_metadata) ? \ + BP_GET_PSIZE(bp) : BP_GET_LSIZE(bp)); + +#define BP_GET_NDVAS(bp) \ + (!!DVA_GET_ASIZE(&(bp)->blk_dva[0]) + \ + !!DVA_GET_ASIZE(&(bp)->blk_dva[1]) + \ + !!DVA_GET_ASIZE(&(bp)->blk_dva[2])) + +#define BP_COUNT_GANG(bp) \ + (DVA_GET_GANG(&(bp)->blk_dva[0]) + \ + DVA_GET_GANG(&(bp)->blk_dva[1]) + \ + DVA_GET_GANG(&(bp)->blk_dva[2])) + +#define DVA_EQUAL(dva1, dva2) \ + ((dva1)->dva_word[1] == (dva2)->dva_word[1] && \ + (dva1)->dva_word[0] == (dva2)->dva_word[0]) + +#define BP_EQUAL(bp1, bp2) \ + (BP_PHYSICAL_BIRTH(bp1) == BP_PHYSICAL_BIRTH(bp2) && \ + DVA_EQUAL(&(bp1)->blk_dva[0], &(bp2)->blk_dva[0]) && \ + DVA_EQUAL(&(bp1)->blk_dva[1], &(bp2)->blk_dva[1]) && \ + DVA_EQUAL(&(bp1)->blk_dva[2], &(bp2)->blk_dva[2])) + +#define ZIO_CHECKSUM_EQUAL(zc1, zc2) \ + (0 == (((zc1).zc_word[0] - (zc2).zc_word[0]) | \ + ((zc1).zc_word[1] - (zc2).zc_word[1]) | \ + ((zc1).zc_word[2] - (zc2).zc_word[2]) | \ + ((zc1).zc_word[3] - (zc2).zc_word[3]))) + +#define DVA_IS_VALID(dva) (DVA_GET_ASIZE(dva) != 0) + +#define ZIO_SET_CHECKSUM(zcp, w0, w1, w2, w3) \ + { \ + (zcp)->zc_word[0] = w0; \ + (zcp)->zc_word[1] = w1; \ + (zcp)->zc_word[2] = w2; \ + (zcp)->zc_word[3] = w3; \ + } + +#define BP_IDENTITY(bp) (&(bp)->blk_dva[0]) +#define BP_IS_GANG(bp) DVA_GET_GANG(BP_IDENTITY(bp)) +#define BP_IS_HOLE(bp) ((bp)->blk_birth == 0) + +/* BP_IS_RAIDZ(bp) assumes no block compression */ +#define BP_IS_RAIDZ(bp) (DVA_GET_ASIZE(&(bp)->blk_dva[0]) > \ + BP_GET_PSIZE(bp)) + +#define BP_ZERO(bp) \ + { \ + (bp)->blk_dva[0].dva_word[0] = 0; \ + (bp)->blk_dva[0].dva_word[1] = 0; \ + (bp)->blk_dva[1].dva_word[0] = 0; \ + (bp)->blk_dva[1].dva_word[1] = 0; \ + (bp)->blk_dva[2].dva_word[0] = 0; \ + (bp)->blk_dva[2].dva_word[1] = 0; \ + (bp)->blk_prop = 
0; \ + (bp)->blk_pad[0] = 0; \ + (bp)->blk_pad[1] = 0; \ + (bp)->blk_phys_birth = 0; \ + (bp)->blk_birth = 0; \ + (bp)->blk_fill = 0; \ + ZIO_SET_CHECKSUM(&(bp)->blk_cksum, 0, 0, 0, 0); \ + } + +#define BP_SPRINTF_LEN 320 + +#endif /* ! GRUB_ZFS_SPA_HEADER */ diff --git a/include/zfs/uberblock_impl.h b/include/zfs/uberblock_impl.h new file mode 100644 index 0000000..12daf98 --- /dev/null +++ b/include/zfs/uberblock_impl.h @@ -0,0 +1,57 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 + * Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2010 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. + */ + +#ifndef _SYS_UBERBLOCK_IMPL_H +#define _SYS_UBERBLOCK_IMPL_H + +#define UBMAX(a, b) ((a) > (b) ? (a) : (b)) + +/* + * The uberblock version is incremented whenever an incompatible on-disk + * format change is made to the SPA, DMU, or ZAP. + * + * Note: the first two fields should never be moved. When a storage pool + * is opened, the uberblock must be read off the disk before the version + * can be checked. If the ub_version field is moved, we may not detect + * version mismatch. If the ub_magic field is moved, applications that + * expect the magic number in the first word won't work. + */ +#define UBERBLOCK_MAGIC 0x00bab10c /* oo-ba-bloc! 
*/ +#define UBERBLOCK_SHIFT 10 /* up to 1K */ + +typedef struct uberblock { + uint64_t ub_magic; /* UBERBLOCK_MAGIC */ + uint64_t ub_version; /* ZFS_VERSION */ + uint64_t ub_txg; /* txg of last sync */ + uint64_t ub_guid_sum; /* sum of all vdev guids */ + uint64_t ub_timestamp; /* UTC time of last sync */ + blkptr_t ub_rootbp; /* MOS objset_phys_t */ +} uberblock_t; + +#define VDEV_UBERBLOCK_SHIFT(as) UBMAX(as, UBERBLOCK_SHIFT) +#define UBERBLOCK_SIZE(as) (1ULL << VDEV_UBERBLOCK_SHIFT(as)) + +/* Number of uberblocks that can fit in the ring at a given ashift */ +#define UBERBLOCK_COUNT(as) (VDEV_UBERBLOCK_RING >> VDEV_UBERBLOCK_SHIFT(as)) + +#endif /* _SYS_UBERBLOCK_IMPL_H */ diff --git a/include/zfs/vdev_impl.h b/include/zfs/vdev_impl.h new file mode 100644 index 0000000..97033c9 --- /dev/null +++ b/include/zfs/vdev_impl.h @@ -0,0 +1,69 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2010 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. 
+ */ + +#ifndef _SYS_VDEV_IMPL_H +#define _SYS_VDEV_IMPL_H + +#define VDEV_SKIP_SIZE (8 << 10) +#define VDEV_BOOT_HEADER_SIZE (8 << 10) +#define VDEV_PHYS_SIZE (112 << 10) +#define VDEV_UBERBLOCK_RING (128 << 10) + +/* ZFS boot block */ +#define VDEV_BOOT_MAGIC 0x2f5b007b10cULL +#define VDEV_BOOT_VERSION 1 /* version number */ + +typedef struct vdev_boot_header { + uint64_t vb_magic; /* VDEV_BOOT_MAGIC */ + uint64_t vb_version; /* VDEV_BOOT_VERSION */ + uint64_t vb_offset; /* start offset (bytes) */ + uint64_t vb_size; /* size (bytes) */ + char vb_pad[VDEV_BOOT_HEADER_SIZE - 4 * sizeof(uint64_t)]; +} vdev_boot_header_t; + +typedef struct vdev_phys { + char vp_nvlist[VDEV_PHYS_SIZE - sizeof(zio_eck_t)]; + zio_eck_t vp_zbt; +} vdev_phys_t; + +typedef struct vdev_label { + char vl_pad[VDEV_SKIP_SIZE]; /* 8K */ + vdev_boot_header_t vl_boot_header; /* 8K */ + vdev_phys_t vl_vdev_phys; /* 112K */ + char vl_uberblock[VDEV_UBERBLOCK_RING]; /* 128K */ +} vdev_label_t; /* 256K total */ + +/* + * Size and offset of embedded boot loader region on each label. + * The total size of the first two labels plus the boot area is 4MB. + */ +#define VDEV_BOOT_OFFSET (2 * sizeof(vdev_label_t)) +#define VDEV_BOOT_SIZE (7ULL << 19) /* 3.5M */ + +/* + * Size of label regions at the start and end of each leaf device. + */ +#define VDEV_LABEL_START_SIZE (2 * sizeof(vdev_label_t) + VDEV_BOOT_SIZE) +#define VDEV_LABEL_END_SIZE (2 * sizeof(vdev_label_t)) +#define VDEV_LABELS 4 + +#endif /* _SYS_VDEV_IMPL_H */ diff --git a/include/zfs/zap_impl.h b/include/zfs/zap_impl.h new file mode 100644 index 0000000..65e9311 --- /dev/null +++ b/include/zfs/zap_impl.h @@ -0,0 +1,112 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 + * Free Software Foundation, Inc. 
+ * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2009 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. + */ + +#ifndef _SYS_ZAP_IMPL_H +#define _SYS_ZAP_IMPL_H + +#define ZAP_MAGIC 0x2F52AB2ABULL + +#define ZAP_HASHBITS 28 +#define MZAP_ENT_LEN 64 +#define MZAP_NAME_LEN (MZAP_ENT_LEN - 8 - 4 - 2) +#define MZAP_MAX_BLKSHIFT SPA_MAXBLOCKSHIFT +#define MZAP_MAX_BLKSZ (1 << MZAP_MAX_BLKSHIFT) + +typedef struct mzap_ent_phys { + uint64_t mze_value; + uint32_t mze_cd; + uint16_t mze_pad; /* in case we want to chain them someday */ + char mze_name[MZAP_NAME_LEN]; +} mzap_ent_phys_t; + +typedef struct mzap_phys { + uint64_t mz_block_type; /* ZBT_MICRO */ + uint64_t mz_salt; + uint64_t mz_pad[6]; + mzap_ent_phys_t mz_chunk[1]; + /* actually variable size depending on block size */ +} mzap_phys_t; + +/* + * The (fat) zap is stored in one object. It is an array of + * 1<<FZAP_BLOCK_SHIFT byte blocks. The layout looks like one of: + * + * ptrtbl fits in first block: + * [zap_phys_t zap_ptrtbl_shift < 6] [zap_leaf_t] ... + * + * ptrtbl too big for first block: + * [zap_phys_t zap_ptrtbl_shift >= 6] [zap_leaf_t] [ptrtbl] ... 
+ * + */ + +#define ZBT_LEAF ((1ULL << 63) + 0) +#define ZBT_HEADER ((1ULL << 63) + 1) +#define ZBT_MICRO ((1ULL << 63) + 3) +/* any other values are ptrtbl blocks */ + +/* + * the embedded pointer table takes up half a block: + * block size / entry size (2^3) / 2 + */ +#define ZAP_EMBEDDED_PTRTBL_SHIFT(zap) (FZAP_BLOCK_SHIFT(zap) - 3 - 1) + +/* + * The embedded pointer table starts half-way through the block. Since + * the pointer table itself is half the block, it starts at (64-bit) + * word number (1<<ZAP_EMBEDDED_PTRTBL_SHIFT(zap)). + */ +#define ZAP_EMBEDDED_PTRTBL_ENT(zap, idx) \ + ((uint64_t *)(zap)->zap_f.zap_phys) \ + [(idx) + (1<<ZAP_EMBEDDED_PTRTBL_SHIFT(zap))] + +/* + * TAKE NOTE: + * If zap_phys_t is modified, zap_byteswap() must be modified. + */ +typedef struct zap_phys { + uint64_t zap_block_type; /* ZBT_HEADER */ + uint64_t zap_magic; /* ZAP_MAGIC */ + + struct zap_table_phys { + uint64_t zt_blk; /* starting block number */ + uint64_t zt_numblks; /* number of blocks */ + uint64_t zt_shift; /* bits to index it */ + uint64_t zt_nextblk; /* next (larger) copy start block */ + uint64_t zt_blks_copied; /* number source blocks copied */ + } zap_ptrtbl; + + uint64_t zap_freeblk; /* the next free block */ + uint64_t zap_num_leafs; /* number of leafs */ + uint64_t zap_num_entries; /* number of entries */ + uint64_t zap_salt; /* salt to stir into hash function */ + uint64_t zap_normflags; /* flags for u8_textprep_str() */ + uint64_t zap_flags; /* zap_flag_t */ + /* + * This structure is followed by padding, and then the embedded + * pointer table. The embedded pointer table takes up second + * half of the block. It is accessed using the + * ZAP_EMBEDDED_PTRTBL_ENT() macro. 
+ */ +} zap_phys_t; + +#endif /* _SYS_ZAP_IMPL_H */ diff --git a/include/zfs/zap_leaf.h b/include/zfs/zap_leaf.h new file mode 100644 index 0000000..4ddddb5 --- /dev/null +++ b/include/zfs/zap_leaf.h @@ -0,0 +1,103 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 + * Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2007 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. + */ + +#ifndef _SYS_ZAP_LEAF_H +#define _SYS_ZAP_LEAF_H + +#define ZAP_LEAF_MAGIC 0x2AB1EAF + +/* chunk size = 24 bytes */ +#define ZAP_LEAF_CHUNKSIZE 24 + +/* + * The amount of space within the chunk available for the array is: + * chunk size - space for type (1) - space for next pointer (2) + */ +#define ZAP_LEAF_ARRAY_BYTES (ZAP_LEAF_CHUNKSIZE - 3) + +typedef enum zap_chunk_type { + ZAP_CHUNK_FREE = 253, + ZAP_CHUNK_ENTRY = 252, + ZAP_CHUNK_ARRAY = 251, + ZAP_CHUNK_TYPE_MAX = 250 +} zap_chunk_type_t; + +/* + * TAKE NOTE: + * If zap_leaf_phys_t is modified, zap_leaf_byteswap() must be modified. 
+ */ +typedef struct zap_leaf_phys { + struct zap_leaf_header { + uint64_t lh_block_type; /* ZBT_LEAF */ + uint64_t lh_pad1; + uint64_t lh_prefix; /* hash prefix of this leaf */ + uint32_t lh_magic; /* ZAP_LEAF_MAGIC */ + uint16_t lh_nfree; /* number free chunks */ + uint16_t lh_nentries; /* number of entries */ + uint16_t lh_prefix_len; /* num bits used to id this */ + + /* above is accessible to zap, below is zap_leaf private */ + + uint16_t lh_freelist; /* chunk head of free list */ + uint8_t lh_pad2[12]; + } l_hdr; /* 2 24-byte chunks */ + + /* + * The header is followed by a hash table with + * ZAP_LEAF_HASH_NUMENTRIES(zap) entries. The hash table is + * followed by an array of ZAP_LEAF_NUMCHUNKS(zap) + * zap_leaf_chunk structures. These structures are accessed + * with the ZAP_LEAF_CHUNK() macro. + */ + + uint16_t l_hash[1]; +} zap_leaf_phys_t; + +typedef union zap_leaf_chunk { + struct zap_leaf_entry { + uint8_t le_type; /* always ZAP_CHUNK_ENTRY */ + uint8_t le_int_size; /* size of ints */ + uint16_t le_next; /* next entry in hash chain */ + uint16_t le_name_chunk; /* first chunk of the name */ + uint16_t le_name_length; /* bytes in name, incl null */ + uint16_t le_value_chunk; /* first chunk of the value */ + uint16_t le_value_length; /* value length in ints */ + uint32_t le_cd; /* collision differentiator */ + uint64_t le_hash; /* hash value of the name */ + } l_entry; + struct zap_leaf_array { + uint8_t la_type; /* always ZAP_CHUNK_ARRAY */ + union { + uint8_t la_array[ZAP_LEAF_ARRAY_BYTES]; + uint64_t la_array64; + } __attribute__ ((packed)); + uint16_t la_next; /* next blk or CHAIN_END */ + } l_array; + struct zap_leaf_free { + uint8_t lf_type; /* always ZAP_CHUNK_FREE */ + uint8_t lf_pad[ZAP_LEAF_ARRAY_BYTES]; + uint16_t lf_next; /* next in free list, or CHAIN_END */ + } l_free; +} zap_leaf_chunk_t; + +#endif /* _SYS_ZAP_LEAF_H */ diff --git a/include/zfs/zfs.h b/include/zfs/zfs.h new file mode 100644 index 0000000..b6d41c0 --- /dev/null +++ 
b/include/zfs/zfs.h @@ -0,0 +1,122 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004,2009 + * Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ + /* + * Copyright (c) 2007, 2010, Oracle and/or its affiliates. All rights reserved. + */ + +#ifndef GRUB_ZFS_HEADER +#define GRUB_ZFS_HEADER 1 + + +/* + * On-disk version number. + */ +#define SPA_VERSION 28ULL + +/* + * The following are configuration names used in the nvlist describing a pool's + * configuration. 
+ */ +#define ZPOOL_CONFIG_VERSION "version" +#define ZPOOL_CONFIG_POOL_NAME "name" +#define ZPOOL_CONFIG_POOL_STATE "state" +#define ZPOOL_CONFIG_POOL_TXG "txg" +#define ZPOOL_CONFIG_POOL_GUID "pool_guid" +#define ZPOOL_CONFIG_CREATE_TXG "create_txg" +#define ZPOOL_CONFIG_TOP_GUID "top_guid" +#define ZPOOL_CONFIG_VDEV_TREE "vdev_tree" +#define ZPOOL_CONFIG_TYPE "type" +#define ZPOOL_CONFIG_CHILDREN "children" +#define ZPOOL_CONFIG_ID "id" +#define ZPOOL_CONFIG_GUID "guid" +#define ZPOOL_CONFIG_PATH "path" +#define ZPOOL_CONFIG_DEVID "devid" +#define ZPOOL_CONFIG_METASLAB_ARRAY "metaslab_array" +#define ZPOOL_CONFIG_METASLAB_SHIFT "metaslab_shift" +#define ZPOOL_CONFIG_ASHIFT "ashift" +#define ZPOOL_CONFIG_ASIZE "asize" +#define ZPOOL_CONFIG_DTL "DTL" +#define ZPOOL_CONFIG_STATS "stats" +#define ZPOOL_CONFIG_WHOLE_DISK "whole_disk" +#define ZPOOL_CONFIG_ERRCOUNT "error_count" +#define ZPOOL_CONFIG_NOT_PRESENT "not_present" +#define ZPOOL_CONFIG_SPARES "spares" +#define ZPOOL_CONFIG_IS_SPARE "is_spare" +#define ZPOOL_CONFIG_NPARITY "nparity" +#define ZPOOL_CONFIG_PHYS_PATH "phys_path" +#define ZPOOL_CONFIG_L2CACHE "l2cache" +#define ZPOOL_CONFIG_HOLE_ARRAY "hole_array" +#define ZPOOL_CONFIG_VDEV_CHILDREN "vdev_children" +#define ZPOOL_CONFIG_IS_HOLE "is_hole" +#define ZPOOL_CONFIG_DDT_HISTOGRAM "ddt_histogram" +#define ZPOOL_CONFIG_DDT_OBJ_STATS "ddt_object_stats" +#define ZPOOL_CONFIG_DDT_STATS "ddt_stats" +/* + * The persistent vdev state is stored as separate values rather than a single + * 'vdev_state' entry. This is because a device can be in multiple states, such + * as offline and degraded. 
+ */ +#define ZPOOL_CONFIG_OFFLINE "offline" +#define ZPOOL_CONFIG_FAULTED "faulted" +#define ZPOOL_CONFIG_DEGRADED "degraded" +#define ZPOOL_CONFIG_REMOVED "removed" + +#define VDEV_TYPE_ROOT "root" +#define VDEV_TYPE_MIRROR "mirror" +#define VDEV_TYPE_REPLACING "replacing" +#define VDEV_TYPE_RAIDZ "raidz" +#define VDEV_TYPE_DISK "disk" +#define VDEV_TYPE_FILE "file" +#define VDEV_TYPE_MISSING "missing" +#define VDEV_TYPE_HOLE "hole" +#define VDEV_TYPE_SPARE "spare" +#define VDEV_TYPE_L2CACHE "l2cache" + +/* + * pool state. The following states are written to disk as part of the normal + * SPA lifecycle: ACTIVE, EXPORTED, DESTROYED, SPARE, L2CACHE. The remaining + * states are software abstractions used at various levels to communicate pool + * state. + */ +typedef enum pool_state { + POOL_STATE_ACTIVE = 0, /* In active use */ + POOL_STATE_EXPORTED, /* Explicitly exported */ + POOL_STATE_DESTROYED, /* Explicitly destroyed */ + POOL_STATE_SPARE, /* Reserved for hot spare use */ + POOL_STATE_L2CACHE, /* Level 2 ARC device */ + POOL_STATE_UNINITIALIZED, /* Internal spa_t state */ + POOL_STATE_UNAVAIL, /* Internal libzfs state */ + POOL_STATE_POTENTIALLY_ACTIVE /* Internal libzfs state */ +} pool_state_t; + +struct grub_zfs_data; + +int grub_zfs_fetch_nvlist(device_t dev, char **nvlist); +int grub_zfs_getmdnobj(device_t dev, const char *fsfilename, + uint64_t *mdnobj); + +char *grub_zfs_nvlist_lookup_string(char *nvlist, char *name); +char *grub_zfs_nvlist_lookup_nvlist(char *nvlist, char *name); +int grub_zfs_nvlist_lookup_uint64(char *nvlist, char *name, + uint64_t *out); +char *grub_zfs_nvlist_lookup_nvlist_array(char *nvlist, char *name, + size_t index); +int grub_zfs_nvlist_lookup_nvlist_array_get_nelm(char *nvlist, char *name); + +#endif /* ! 
GRUB_ZFS_HEADER */ diff --git a/include/zfs/zfs_acl.h b/include/zfs/zfs_acl.h new file mode 100644 index 0000000..66749af --- /dev/null +++ b/include/zfs/zfs_acl.h @@ -0,0 +1,55 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 + * Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2007 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. 
+ */ + +#ifndef _SYS_FS_ZFS_ACL_H +#define _SYS_FS_ZFS_ACL_H + +typedef struct zfs_oldace { + uint32_t z_fuid; /* "who" */ + uint32_t z_access_mask; /* access mask */ + uint16_t z_flags; /* flags, i.e inheritance */ + uint16_t z_type; /* type of entry allow/deny */ +} zfs_oldace_t; + +#define ACE_SLOT_CNT 6 + +typedef struct zfs_znode_acl_v0 { + uint64_t z_acl_extern_obj; /* ext acl pieces */ + uint32_t z_acl_count; /* Number of ACEs */ + uint16_t z_acl_version; /* acl version */ + uint16_t z_acl_pad; /* pad */ + zfs_oldace_t z_ace_data[ACE_SLOT_CNT]; /* 6 standard ACEs */ +} zfs_znode_acl_v0_t; + +#define ZFS_ACE_SPACE (sizeof(zfs_oldace_t) * ACE_SLOT_CNT) + +typedef struct zfs_znode_acl { + uint64_t z_acl_extern_obj; /* ext acl pieces */ + uint32_t z_acl_size; /* Number of bytes in ACL */ + uint16_t z_acl_version; /* acl version */ + uint16_t z_acl_count; /* ace count */ + uint8_t z_ace_data[ZFS_ACE_SPACE]; /* space for embedded ACEs */ +} zfs_znode_acl_t; + + +#endif /* _SYS_FS_ZFS_ACL_H */ diff --git a/include/zfs/zfs_znode.h b/include/zfs/zfs_znode.h new file mode 100644 index 0000000..e3265e3 --- /dev/null +++ b/include/zfs/zfs_znode.h @@ -0,0 +1,70 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2010 Sun Microsystems, Inc. 
All rights reserved. + * Use is subject to license terms. + */ + +#ifndef _SYS_FS_ZFS_ZNODE_H +#define _SYS_FS_ZFS_ZNODE_H + +#include <zfs/zfs_acl.h> + +#define MASTER_NODE_OBJ 1 +#define ZFS_ROOT_OBJ "ROOT" +#define ZPL_VERSION_STR "VERSION" +#define ZFS_SA_ATTRS "SA_ATTRS" + +#define ZPL_VERSION 5ULL + +#define ZFS_DIRENT_OBJ(de) BF64_GET(de, 0, 48) + +/* + * This is the persistent portion of the znode. It is stored + * in the "bonus buffer" of the file. Short symbolic links + * are also stored in the bonus buffer. + */ +typedef struct znode_phys { + uint64_t zp_atime[2]; /* 0 - last file access time */ + uint64_t zp_mtime[2]; /* 16 - last file modification time */ + uint64_t zp_ctime[2]; /* 32 - last file change time */ + uint64_t zp_crtime[2]; /* 48 - creation time */ + uint64_t zp_gen; /* 64 - generation (txg of creation) */ + uint64_t zp_mode; /* 72 - file mode bits */ + uint64_t zp_size; /* 80 - size of file */ + uint64_t zp_parent; /* 88 - directory parent (`..') */ + uint64_t zp_links; /* 96 - number of links to file */ + uint64_t zp_xattr; /* 104 - DMU object for xattrs */ + uint64_t zp_rdev; /* 112 - dev_t for VBLK & VCHR files */ + uint64_t zp_flags; /* 120 - persistent flags */ + uint64_t zp_uid; /* 128 - file owner */ + uint64_t zp_gid; /* 136 - owning group */ + uint64_t zp_pad[4]; /* 144 - future */ + zfs_znode_acl_t zp_acl; /* 176 - 263 ACL */ + /* + * Data may pad out any remaining bytes in the znode buffer, eg: + * + * |<---------------------- dnode_phys (512) ------------------------>| + * |<-- dnode (192) --->|<----------- "bonus" buffer (320) ---------->| + * |<---- znode (264) ---->|<---- data (56) ---->| + * + * At present, we only use this space to store symbolic links. 
+ */ +} znode_phys_t; + +#endif /* _SYS_FS_ZFS_ZNODE_H */ diff --git a/include/zfs/zil.h b/include/zfs/zil.h new file mode 100644 index 0000000..bc9d5e9 --- /dev/null +++ b/include/zfs/zil.h @@ -0,0 +1,56 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2009 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. + */ + +#ifndef _SYS_ZIL_H +#define _SYS_ZIL_H + +/* + * Intent log format: + * + * Each objset has its own intent log. The log header (zil_header_t) + * for objset N's intent log is kept in the Nth object of the SPA's + * intent_log objset. The log header points to a chain of log blocks, + * each of which contains log records (i.e., transactions) followed by + * a log block trailer (zil_trailer_t). The format of a log record + * depends on the record (or transaction) type, but all records begin + * with a common structure that defines the type, length, and txg. + */ + +/* + * Intent log header - this on disk structure holds fields to manage + * the log. All fields are 64 bit to easily handle cross architectures. 
+ */ +typedef struct zil_header { + uint64_t zh_claim_txg; /* txg in which log blocks were claimed */ + uint64_t zh_replay_seq; /* highest replayed sequence number */ + blkptr_t zh_log; /* log chain */ + uint64_t zh_claim_seq; /* highest claimed sequence number */ + uint64_t zh_flags; /* header flags */ + uint64_t zh_pad[4]; +} zil_header_t; + +/* + * zh_flags bit settings + */ +#define ZIL_REPLAY_NEEDED 0x1 /* replay needed - internal only */ + +#endif /* _SYS_ZIL_H */ diff --git a/include/zfs/zio.h b/include/zfs/zio.h new file mode 100644 index 0000000..38f90d5 --- /dev/null +++ b/include/zfs/zio.h @@ -0,0 +1,92 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2010 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. + */ + +#ifndef _ZIO_H +#define _ZIO_H + +#include <zfs/spa.h> + +#define ZEC_MAGIC 0x210da7ab10c7a11ULL /* zio data bloc tail */ + +typedef struct zio_eck { + uint64_t zec_magic; /* for validation, endianness */ + zio_cksum_t zec_cksum; /* 256-bit checksum */ +} zio_eck_t; + +/* + * Gang block headers are self-checksumming and contain an array + * of block pointers. 
+ */ +#define SPA_GANGBLOCKSIZE SPA_MINBLOCKSIZE +#define SPA_GBH_NBLKPTRS ((SPA_GANGBLOCKSIZE - \ + sizeof(zio_eck_t)) / sizeof(blkptr_t)) +#define SPA_GBH_FILLER ((SPA_GANGBLOCKSIZE - \ + sizeof(zio_eck_t) - \ + (SPA_GBH_NBLKPTRS * sizeof(blkptr_t))) /\ + sizeof(uint64_t)) + +#define ZIO_GET_IOSIZE(zio) \ + (BP_IS_GANG((zio)->io_bp) ? \ + SPA_GANGBLOCKSIZE : BP_GET_PSIZE((zio)->io_bp)) + +typedef struct zio_gbh { + blkptr_t zg_blkptr[SPA_GBH_NBLKPTRS]; + uint64_t zg_filler[SPA_GBH_FILLER]; + zio_eck_t zg_tail; +} zio_gbh_phys_t; + +enum zio_checksum { + ZIO_CHECKSUM_INHERIT = 0, + ZIO_CHECKSUM_ON, + ZIO_CHECKSUM_OFF, + ZIO_CHECKSUM_LABEL, + ZIO_CHECKSUM_GANG_HEADER, + ZIO_CHECKSUM_ZILOG, + ZIO_CHECKSUM_FLETCHER_2, + ZIO_CHECKSUM_FLETCHER_4, + ZIO_CHECKSUM_SHA256, + ZIO_CHECKSUM_ZILOG2, + ZIO_CHECKSUM_FUNCTIONS +}; + +#define ZIO_CHECKSUM_ON_VALUE ZIO_CHECKSUM_FLETCHER_2 +#define ZIO_CHECKSUM_DEFAULT ZIO_CHECKSUM_ON + +enum zio_compress { + ZIO_COMPRESS_INHERIT = 0, + ZIO_COMPRESS_ON, + ZIO_COMPRESS_OFF, + ZIO_COMPRESS_LZJB, + ZIO_COMPRESS_EMPTY, + ZIO_COMPRESS_GZIP1, + ZIO_COMPRESS_GZIP2, + ZIO_COMPRESS_GZIP3, + ZIO_COMPRESS_GZIP4, + ZIO_COMPRESS_GZIP5, + ZIO_COMPRESS_GZIP6, + ZIO_COMPRESS_GZIP7, + ZIO_COMPRESS_GZIP8, + ZIO_COMPRESS_GZIP9, + ZIO_COMPRESS_FUNCTIONS +}; + +#endif /* _ZIO_H */ diff --git a/include/zfs/zio_checksum.h b/include/zfs/zio_checksum.h new file mode 100644 index 0000000..8ade44a --- /dev/null +++ b/include/zfs/zio_checksum.h @@ -0,0 +1,49 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. 
+ * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2010 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. + */ + +#ifndef _SYS_ZIO_CHECKSUM_H +#define _SYS_ZIO_CHECKSUM_H + +/* + * Signature for checksum functions. + */ +typedef void zio_checksum_t(const void *data, uint64_t size, + grub_zfs_endian_t endian, zio_cksum_t *zcp); + +/* + * Information about each checksum function. + */ +typedef struct zio_checksum_info { + zio_checksum_t *ci_func; /* checksum function for each byteorder */ + int ci_correctable; /* number of correctable bits */ + int ci_eck; /* uses zio embedded checksum? */ + char *ci_name; /* descriptive name */ +} zio_checksum_info_t; + +extern void zio_checksum_SHA256(const void *, uint64_t, + grub_zfs_endian_t endian, zio_cksum_t *); +extern void fletcher_2(const void *, uint64_t, grub_zfs_endian_t endian, + zio_cksum_t *); +extern void fletcher_4(const void *, uint64_t, grub_zfs_endian_t endian, + zio_cksum_t *); + +#endif /* _SYS_ZIO_CHECKSUM_H */

commit bc192bb0716b02b2b711dc2df62ed15e1160ea50 Author: Jorgen Lundman lundman@lundman.net Date: Wed May 23 01:55:02 2012 +0000
typedef changes to decompress
commit bb131ecb4d8e0f3cdc0658457193bb246086bd85 Author: Jorgen Lundman lundman@lundman.net Date: Wed May 23 01:22:17 2012 +0000
Minor style changes
commit 4faac896fc0bdd7cf4eb4669ce9d4a706d8c19ac Author: Jorgen Lundman lundman@lundman.net Date: Wed May 23 00:23:26 2012 +0000
Style fixes
commit 5fdd1858f1b578067d2e063f2ec88a72f89c5925 Author: Jorgen Lundman lundman@lundman.net Date: Wed May 23 00:07:28 2012 +0000
ZFS
commit 22f35810e150eef0e0a0ce024fca8e7a53b93e71 Author: Jorgen Lundman lundman@lundman.net Date: Tue May 22 08:00:30 2012 +0000
code changes
commit 91652ac002d0e78b3a70d5cbcb3a3aef63273c66 Author: Jorgen Lundman lundman@lundman.net Date: Tue May 22 07:22:55 2012 +0000
code syntax
commit ed20af0a540dba68b8d11b7c069fc51d14d1a195 Author: Jorgen Lundman lundman@lundman.net Date: Tue May 22 06:50:42 2012 +0000
tabify
commit 7ce43d7c3621f8ca9d55bd6ce190bd7663a04a29 Author: Jorgen Lundman lundman@lundman.net Date: Tue May 22 06:38:23 2012 +0000
Coding style changes
commit 0adcf5d595044a06dfaa948cab6db3a4a880f5a9 Author: Jorgen Lundman lundman@lundman.net Date: Mon May 21 08:42:48 2012 +0000
importing ZFS
commit 258052be456f8b3de18d389fb8ac215eb87f7aeb Author: Jorgen Lundman lundman@lundman.net Date: Thu May 10 07:31:04 2012 +0000
Remove local debug file
commit 9e22da44e013963f9dac13e6da9d134ae8f4e164 Author: Jorgen Lundman lundman@lundman.net Date: Thu May 10 05:46:50 2012 +0000
Missing header files
commit 0428d7116788ea1e9375cbdef7ba357199930e76 Author: Jorgen Lundman lundman@lundman.net Date: Thu May 10 05:30:47 2012 +0000
Update
commit 766e5ecb8db0cbb6cd073d6ce32b03f51f249062 Author: Jorgen Lundman lundman@lundman.net Date: Thu May 10 05:19:52 2012 +0000
ZFS changes
commit bea9588d98f52d95a325f3b71a7ae448242c7b64 Author: Jorgen Lundman lundman@lundman.net Date: Thu May 10 05:11:03 2012 +0000
Adding ZFS --- Makefile | 2 +- common/Makefile | 1 + common/cmd_zfs.c | 244 +++++ fs/Makefile | 1 + fs/{ => zfs}/Makefile | 43 +- fs/zfs/dev.c | 139 +++ fs/zfs/zfs.c | 2414 ++++++++++++++++++++++++++++++++++++++++++++++ fs/zfs/zfs_fletcher.c | 84 ++ fs/zfs/zfs_lzjb.c | 94 ++ fs/zfs/zfs_sha256.c | 145 +++ include/config_cmd_all.h | 1 + include/zfs_common.h | 94 ++ 12 files changed, 3246 insertions(+), 16 deletions(-) create mode 100644 common/cmd_zfs.c copy fs/{ => zfs}/Makefile (52%) create mode 100644 fs/zfs/dev.c create mode 100644 fs/zfs/zfs.c create mode 100644 fs/zfs/zfs_fletcher.c create mode 100644 fs/zfs/zfs_lzjb.c create mode 100644 fs/zfs/zfs_sha256.c create mode 100644 include/zfs_common.h
diff --git a/Makefile b/Makefile index 351a8f0..d3b84bf 100644 --- a/Makefile +++ b/Makefile @@ -244,7 +244,7 @@ endif LIBS += arch/$(ARCH)/lib/lib$(ARCH).o LIBS += fs/cramfs/libcramfs.o fs/fat/libfat.o fs/fdos/libfdos.o fs/jffs2/libjffs2.o \ fs/reiserfs/libreiserfs.o fs/ext2/libext2fs.o fs/yaffs2/libyaffs2.o \ - fs/ubifs/libubifs.o + fs/ubifs/libubifs.o fs/zfs/libzfs.o LIBS += net/libnet.o LIBS += disk/libdisk.o LIBS += drivers/bios_emulator/libatibiosemu.o diff --git a/common/Makefile b/common/Makefile index 6e23baa..181a9ad 100644 --- a/common/Makefile +++ b/common/Makefile @@ -90,6 +90,7 @@ COBJS-$(CONFIG_CMD_ELF) += cmd_elf.o COBJS-$(CONFIG_SYS_HUSH_PARSER) += cmd_exit.o COBJS-$(CONFIG_CMD_EXT2) += cmd_ext2.o COBJS-$(CONFIG_CMD_FAT) += cmd_fat.o +COBJS-$(CONFIG_CMD_ZFS) += cmd_zfs.o COBJS-$(CONFIG_CMD_FDC)$(CONFIG_CMD_FDOS) += cmd_fdc.o COBJS-$(CONFIG_OF_LIBFDT) += cmd_fdt.o fdt_support.o COBJS-$(CONFIG_CMD_FDOS) += cmd_fdos.o diff --git a/common/cmd_zfs.c b/common/cmd_zfs.c new file mode 100644 index 0000000..99c4318 --- /dev/null +++ b/common/cmd_zfs.c @@ -0,0 +1,244 @@ +/* + * + * ZFS filesystem implementation in Uboot by + * Jorgen Lundman <lundman at lundman.net> + * + * zfsfs support + * made from existing GRUB Sources by Sun, GNU and others. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License as + * published by the Free Software Foundation; either version 2 of + * the License, or (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. 
+ * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, + * MA 02111-1307 USA + * + */ + + +/* + * Changelog: + * 0.1 - The Epoch + * - lundman + */ + +#include <common.h> +#include <part.h> +#include <config.h> +#include <command.h> +#include <image.h> +#include <linux/ctype.h> +#include <asm/byteorder.h> +#include <zfs_common.h> +#include <linux/stat.h> +#include <malloc.h> + +#if defined(CONFIG_CMD_USB) && defined(CONFIG_USB_STORAGE) +#include <usb.h> +#endif + +#if !defined(CONFIG_DOS_PARTITION) && !defined(CONFIG_EFI_PARTITION) +#error DOS or EFI partition support must be selected +#endif + +#define DOS_PART_MAGIC_OFFSET 0x1fe +#define DOS_FS_TYPE_OFFSET 0x36 +#define DOS_FS32_TYPE_OFFSET 0x52 + +static int do_zfs_load(cmd_tbl_t *cmdtp, int flag, int argc, + char *argv[]) +{ + char *filename = NULL; + char *ep; + int dev; + unsigned long part = 1; + ulong addr = 0; + ulong part_length; + disk_partition_t info; + char buf[12]; + unsigned long count; + const char *addr_str; + struct zfs_file zfile; + struct device_s vdev; + + if (argc < 3) + return CMD_RET_USAGE; + + count = 0; + addr = (argc > 3) ? simple_strtoul(argv[3], NULL, 16) : 0; + filename = getenv("bootfile"); + switch (argc) { + case 3: + addr_str = getenv("loadaddr"); + if (addr_str != NULL) + addr = simple_strtoul(addr_str, NULL, 16); + else + addr = CONFIG_SYS_LOAD_ADDR; + + break; + case 4: + break; + case 5: + filename = argv[4]; + break; + case 6: + filename = argv[4]; + count = simple_strtoul(argv[5], NULL, 16); + break; + + default: + return cmd_usage(cmdtp); + } + + if (!filename) { + puts("** No boot file defined **\n"); + return 1; + } + + dev = (int)simple_strtoul(argv[2], &ep, 16); + zfs_dev_desc = get_dev(argv[1], dev); + if (zfs_dev_desc == NULL) { + printf("** Block device %s %d not supported\n", argv[1], dev); + return 1; + } + + if (*ep) { + if (*ep != ':') { + 
puts("** Invalid boot device, use `dev[:part]' **\n"); + return 1; + } + part = simple_strtoul(++ep, NULL, 16); + } + + if (part != 0) { + if (get_partition_info(zfs_dev_desc, part, &info)) { + printf("** Bad partition %lu **\n", part); + return 1; + } + + if (strncmp((char *)info.type, BOOT_PART_TYPE, + strlen(BOOT_PART_TYPE)) != 0) { + printf("** Invalid partition type \"%s\" (expect \"" BOOT_PART_TYPE "\")\n", + info.type); + return 1; + } + printf("Loading file \"%s\" " + "from %s device %d:%lu %s\n", + filename, argv[1], dev, part, info.name); + } else { + printf("Loading file \"%s\" from %s device %d\n", + filename, argv[1], dev); + } + + part_length = zfs_set_blk_dev(zfs_dev_desc, part); + if (part_length == 0) { + printf("**Bad partition - %s %d:%lu **\n", argv[1], dev, part); + return 1; + } + + vdev.part_length = part_length; + + memset(&zfile, 0, sizeof(zfile)); + zfile.device = &vdev; + if (zfs_open(&zfile, filename)) { + printf("** File not found %s\n", filename); + return 1; + } + + if ((count < zfile.size) && (count != 0)) + zfile.size = (uint64_t)count; + + if (zfs_read(&zfile, (char *)addr, zfile.size) != zfile.size) { + printf("** Unable to read \"%s\" from %s %d:%lu **\n", + filename, argv[1], dev, part); + zfs_close(&zfile); + return 1; + } + + zfs_close(&zfile); + + /* Loading ok, update default load address */ + load_addr = addr; + + printf("%llu bytes read\n", zfile.size); + sprintf(buf, "%llX", zfile.size); + setenv("filesize", buf); + + return 0; +} + + +int zfs_print(const char *entry, const struct zfs_dirhook_info *data) +{ + printf("%s %s\n", + data->dir ? 
"<DIR> " : " ", + entry); + return 0; /* 0 continue, 1 stop */ +} + + + +static int do_zfs_ls(cmd_tbl_t *cmdtp, int flag, int argc, char *argv[]) +{ + const char *filename = "/"; + int dev; + unsigned long part = 1; + char *ep; + int part_length; + struct device_s vdev; + + if (argc < 3) + return cmd_usage(cmdtp); + + dev = (int)simple_strtoul(argv[2], &ep, 16); + zfs_dev_desc = get_dev(argv[1], dev); + + if (zfs_dev_desc == NULL) { + printf("\n** Block device %s %d not supported\n", argv[1], dev); + return 1; + } + + if (*ep) { + if (*ep != ':') { + puts("\n** Invalid boot device, use `dev[:part]' **\n"); + return 1; + } + part = simple_strtoul(++ep, NULL, 16); + } + + if (argc == 4) + filename = argv[3]; + + part_length = zfs_set_blk_dev(zfs_dev_desc, part); + if (part_length == 0) { + printf("** Bad partition - %s %d:%lu **\n", argv[1], dev, part); + return 1; + } + + vdev.part_length = part_length; + + zfs_ls(&vdev, filename, + zfs_print); + + return 0; +} + + +U_BOOT_CMD(zfsls, 4, 1, do_zfs_ls, + "list files in a directory (default /)", + "<interface> <dev[:part]> [directory]\n" + " - list files from 'dev' on 'interface' in a '/DATASET/@/$dir/'"); + +U_BOOT_CMD(zfsload, 6, 0, do_zfs_load, + "load binary file from a ZFS filesystem", + "<interface> <dev[:part]> [addr] [filename] [bytes]\n" + " - load binary file '/DATASET/@/$dir/$file' from 'dev' on 'interface'\n" + " to address 'addr' from ZFS filesystem"); diff --git a/fs/Makefile b/fs/Makefile index 22aad12..b0d62c6 100644 --- a/fs/Makefile +++ b/fs/Makefile @@ -25,6 +25,7 @@ subdirs-$(CONFIG_CMD_CRAMFS) := cramfs subdirs-$(CONFIG_CMD_EXT2) += ext2 subdirs-$(CONFIG_CMD_FAT) += fat +subdirs-$(CONFIG_CMD_ZFS) += zfs subdirs-$(CONFIG_CMD_FDOS) += fdos subdirs-$(CONFIG_CMD_JFFS2) += jffs2 subdirs-$(CONFIG_CMD_REISER) += reiserfs diff --git a/fs/Makefile b/fs/zfs/Makefile similarity index 52% copy from fs/Makefile copy to fs/zfs/Makefile index 22aad12..00ab9e6 100644 --- a/fs/Makefile +++ b/fs/zfs/Makefile @@ -1,6 
+1,10 @@ # -# (C) Copyright 2000-2006 -# Wolfgang Denk, DENX Software Engineering, wd@denx.de. +# (C) Copyright 2006 +# Wolfgang Denk, DENX Software Engineering, <wd at denx.de> +# +# (C) Copyright 2003 +# Pavel Bartusek, Sysgo Real-Time Solutions AG, <pba at sysgo.de> +# # # See file CREDITS for list of people who contributed to this # project. @@ -20,19 +24,28 @@ # Foundation, Inc., 59 Temple Place, Suite 330, Boston, # MA 02111-1307 USA # -#
-subdirs-$(CONFIG_CMD_CRAMFS) := cramfs -subdirs-$(CONFIG_CMD_EXT2) += ext2 -subdirs-$(CONFIG_CMD_FAT) += fat -subdirs-$(CONFIG_CMD_FDOS) += fdos -subdirs-$(CONFIG_CMD_JFFS2) += jffs2 -subdirs-$(CONFIG_CMD_REISER) += reiserfs -subdirs-$(CONFIG_YAFFS2) += yaffs2 -subdirs-$(CONFIG_CMD_UBIFS) += ubifs +include $(TOPDIR)/config.mk + +LIB = $(obj)libzfs.o + +AOBJS = +COBJS-$(CONFIG_CMD_ZFS) := dev.o zfs.o zfs_fletcher.o zfs_sha256.o zfs_lzjb.o + +SRCS := $(AOBJS:.o=.S) $(COBJS-y:.o=.c) +OBJS := $(addprefix $(obj),$(AOBJS) $(COBJS-y)) + + +all: $(LIB) $(AOBJS) + +$(LIB): $(obj).depend $(OBJS) + $(call cmd_link_o_target, $(OBJS)) + +######################################################################### + +# defines $(obj).depend target +include $(SRCTREE)/rules.mk
-SUBDIRS := $(subdirs-y) +sinclude $(obj).depend
-$(obj).depend all: - @for dir in $(SUBDIRS) ; do \ - $(MAKE) -C $$dir $@ ; done +######################################################################### diff --git a/fs/zfs/dev.c b/fs/zfs/dev.c new file mode 100644 index 0000000..d61ff80 --- /dev/null +++ b/fs/zfs/dev.c @@ -0,0 +1,139 @@ +/* + * + * based on code of fs/reiserfs/dev.c by + * + * (C) Copyright 2003 - 2004 + * Sysgo AG, <www.elinos.com>, Pavel Bartusek pba@sysgo.com + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. 
+ */ + + +#include <common.h> +#include <config.h> +#include <zfs_common.h> + +static block_dev_desc_t *zfs_block_dev_desc; +static disk_partition_t part_info; + +int zfs_set_blk_dev(block_dev_desc_t *rbdd, int part) +{ + zfs_block_dev_desc = rbdd; + + if (part == 0) { + /* disk doesn't use partition table */ + part_info.start = 0; + part_info.size = rbdd->lba; + part_info.blksz = rbdd->blksz; + } else { + if (get_partition_info + (zfs_block_dev_desc, part, &part_info)) { + return 0; + } + } + return part_info.size; +} + +/* err */ +int zfs_devread(int sector, int byte_offset, int byte_len, char *buf) +{ + short sec_buffer[SECTOR_SIZE/sizeof(short)]; + char *sec_buf = sec_buffer; + unsigned block_len; + + /* + * Check partition boundaries + */ + if ((sector < 0) + || ((sector + ((byte_offset + byte_len - 1) >> SECTOR_BITS)) >= + part_info.size)) { + /* errnum = ERR_OUTSIDE_PART; */ + printf(" ** zfs_devread() read outside partition sector %d\n", sector); + return 1; + } + + /* + * Get the read to the beginning of a partition. 
+ */ + sector += byte_offset >> SECTOR_BITS; + byte_offset &= SECTOR_SIZE - 1; + + debug(" <%d, %d, %d>\n", sector, byte_offset, byte_len); + + if (zfs_block_dev_desc == NULL) { + printf("** Invalid Block Device Descriptor (NULL)\n"); + return 1; + } + + if (byte_offset != 0) { + /* read first part which isn't aligned with start of sector */ + if (zfs_block_dev_desc-> + block_read(zfs_block_dev_desc->dev, + part_info.start + sector, 1, + (unsigned long *) sec_buf) != 1) { + printf(" ** zfs_devread() read error **\n"); + return 1; + } + memcpy(buf, sec_buf + byte_offset, + min(SECTOR_SIZE - byte_offset, byte_len)); + buf += min(SECTOR_SIZE - byte_offset, byte_len); + byte_len -= min(SECTOR_SIZE - byte_offset, byte_len); + sector++; + } + + if (byte_len == 0) + return 0; + + /* read sector aligned part */ + block_len = byte_len & ~(SECTOR_SIZE - 1); + + if (block_len == 0) { + u8 p[SECTOR_SIZE]; + + block_len = SECTOR_SIZE; + zfs_block_dev_desc->block_read(zfs_block_dev_desc->dev, + part_info.start + sector, + 1, (unsigned long *)p); + memcpy(buf, p, byte_len); + return 0; + } + + if (zfs_block_dev_desc->block_read(zfs_block_dev_desc->dev, + part_info.start + sector, + block_len / SECTOR_SIZE, + (unsigned long *) buf) != + block_len / SECTOR_SIZE) { + printf(" ** zfs_devread() read error - block\n"); + return 1; + } + + block_len = byte_len & ~(SECTOR_SIZE - 1); + buf += block_len; + byte_len -= block_len; + sector += block_len / SECTOR_SIZE; + + if (byte_len != 0) { + /* read rest of data which are not in whole sector */ + if (zfs_block_dev_desc-> + block_read(zfs_block_dev_desc->dev, + part_info.start + sector, 1, + (unsigned long *) sec_buf) != 1) { + printf(" ** zfs_devread() read error - last part\n"); + return 1; + } + memcpy(buf, sec_buf, byte_len); + } + return 0; +} diff --git a/fs/zfs/zfs.c b/fs/zfs/zfs.c new file mode 100644 index 0000000..e7369cd --- /dev/null +++ b/fs/zfs/zfs.c @@ -0,0 +1,2414 @@ +/* + * ZFS filesystem implementation in u-boot by + * 
Jorgen Lundman <lundman at lundman.net> + * + * ZFS-fs support + * made from existing GRUB Sources by Sun, GNU and others. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. + */ + + +#include <common.h> +#include <malloc.h> +#include <linux/stat.h> +#include <linux/time.h> +#include <linux/ctype.h> +#include <asm/byteorder.h> +#include "zfs_common.h" + + +block_dev_desc_t *zfs_dev_desc; + +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004,2009,2010 + * Free Software Foundation, Inc. + * Copyright 2010 Sun Microsystems, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. 
+ */ +/* + * The zfs plug-in routines for GRUB are: + * + * zfs_mount() - locates a valid uberblock of the root pool and reads + * in its MOS at the memory address MOS. + * + * zfs_open() - locates a plain file object by following the MOS + * and places its dnode at the memory address DNODE. + * + * zfs_read() - read in the data blocks pointed by the DNODE. + * + */ + +#include <zfs/zfs.h> +#include <zfs/zio.h> +#include <zfs/dnode.h> +#include <zfs/uberblock_impl.h> +#include <zfs/vdev_impl.h> +#include <zfs/zio_checksum.h> +#include <zfs/zap_impl.h> +#include <zfs/zap_leaf.h> +#include <zfs/zfs_znode.h> +#include <zfs/dmu.h> +#include <zfs/dmu_objset.h> +#include <zfs/sa_impl.h> +#include <zfs/dsl_dir.h> +#include <zfs/dsl_dataset.h> + + +#define ZPOOL_PROP_BOOTFS "bootfs" + + +/* + * For nvlist manipulation. (from nvpair.h) + */ +#define NV_ENCODE_NATIVE 0 +#define NV_ENCODE_XDR 1 +#define NV_BIG_ENDIAN 0 +#define NV_LITTLE_ENDIAN 1 +#define DATA_TYPE_UINT64 8 +#define DATA_TYPE_STRING 9 +#define DATA_TYPE_NVLIST 19 +#define DATA_TYPE_NVLIST_ARRAY 20 + + +/* + * Macros to get fields in a bp or DVA. + */ +#define P2PHASE(x, align) ((x) & ((align) - 1)) +#define DVA_OFFSET_TO_PHYS_SECTOR(offset) \ + ((offset + VDEV_LABEL_START_SIZE) >> SPA_MINBLOCKSHIFT) + +/* + * return x rounded down to an align boundary + * eg, P2ALIGN(1200, 1024) == 1024 (1*align) + * eg, P2ALIGN(1024, 1024) == 1024 (1*align) + * eg, P2ALIGN(0x1234, 0x100) == 0x1200 (0x12*align) + * eg, P2ALIGN(0x5600, 0x100) == 0x5600 (0x56*align) + */ +#define P2ALIGN(x, align) ((x) & -(align)) + +/* + * FAT ZAP data structures + */ +#define ZFS_CRC64_POLY 0xC96C5795D7870F42ULL /* ECMA-182, reflected form */ +#define ZAP_HASH_IDX(hash, n) (((n) == 0) ? 
0 : ((hash) >> (64 - (n)))) +#define CHAIN_END 0xffff /* end of the chunk chain */ + +/* + * The amount of space within the chunk available for the array is: + * chunk size - space for type (1) - space for next pointer (2) + */ +#define ZAP_LEAF_ARRAY_BYTES (ZAP_LEAF_CHUNKSIZE - 3) + +#define ZAP_LEAF_HASH_SHIFT(bs) (bs - 5) +#define ZAP_LEAF_HASH_NUMENTRIES(bs) (1 << ZAP_LEAF_HASH_SHIFT(bs)) +#define LEAF_HASH(bs, h) \ + ((ZAP_LEAF_HASH_NUMENTRIES(bs)-1) & \ + ((h) >> (64 - ZAP_LEAF_HASH_SHIFT(bs)-l->l_hdr.lh_prefix_len))) + +/* + * The amount of space available for chunks is: + * block size shift - hash entry size (2) * number of hash + * entries - header space (2*chunksize) + */ +#define ZAP_LEAF_NUMCHUNKS(bs) \ + (((1<<bs) - 2*ZAP_LEAF_HASH_NUMENTRIES(bs)) / \ + ZAP_LEAF_CHUNKSIZE - 2) + +/* + * The chunks start immediately after the hash table. The end of the + * hash table is at l_hash + HASH_NUMENTRIES, which we simply cast to a + * chunk_t. + */ +#define ZAP_LEAF_CHUNK(l, bs, idx) \ + ((zap_leaf_chunk_t *)(l->l_hash + ZAP_LEAF_HASH_NUMENTRIES(bs)))[idx] +#define ZAP_LEAF_ENTRY(l, bs, idx) (&ZAP_LEAF_CHUNK(l, bs, idx).l_entry) + + +/* + * Decompression Entry - lzjb + */ +#ifndef NBBY +#define NBBY 8 +#endif + + + +typedef int zfs_decomp_func_t(void *s_start, void *d_start, + uint32_t s_len, uint32_t d_len); +typedef struct decomp_entry { + char *name; + zfs_decomp_func_t *decomp_func; +} decomp_entry_t; + +typedef struct dnode_end { + dnode_phys_t dn; + grub_zfs_endian_t endian; +} dnode_end_t; + +struct grub_zfs_data { + /* cache for a file block of the currently zfs_open()-ed file */ + char *file_buf; + uint64_t file_start; + uint64_t file_end; + + /* XXX: ashift is per vdev, not per pool. We currently only ever touch + * a single vdev, but when/if raid-z or stripes are supported, this + * may need revision. 
+ */ + uint64_t vdev_ashift; + uint64_t label_txg; + uint64_t pool_guid; + + /* cache for a dnode block */ + dnode_phys_t *dnode_buf; + dnode_phys_t *dnode_mdn; + uint64_t dnode_start; + uint64_t dnode_end; + grub_zfs_endian_t dnode_endian; + + uberblock_t current_uberblock; + + dnode_end_t mos; + dnode_end_t mdn; + dnode_end_t dnode; + + uint64_t vdev_phys_sector; + + int (*userhook)(const char *, const struct zfs_dirhook_info *); + struct zfs_dirhook_info *dirinfo; + +}; + + +/* FIXME: calls itself unconditionally (infinite recursion); this must + * call a real zlib inflate routine instead. */ +static int +zlib_decompress(void *s, void *d, + uint32_t slen, uint32_t dlen) +{ + if (zlib_decompress(s, d, slen, dlen) < 0) + return ZFS_ERR_BAD_FS; + return ZFS_ERR_NONE; +} + +static decomp_entry_t decomp_table[ZIO_COMPRESS_FUNCTIONS] = { + {"inherit", NULL}, /* ZIO_COMPRESS_INHERIT */ + {"on", lzjb_decompress}, /* ZIO_COMPRESS_ON */ + {"off", NULL}, /* ZIO_COMPRESS_OFF */ + {"lzjb", lzjb_decompress}, /* ZIO_COMPRESS_LZJB */ + {"empty", NULL}, /* ZIO_COMPRESS_EMPTY */ + {"gzip-1", zlib_decompress}, /* ZIO_COMPRESS_GZIP1 */ + {"gzip-2", zlib_decompress}, /* ZIO_COMPRESS_GZIP2 */ + {"gzip-3", zlib_decompress}, /* ZIO_COMPRESS_GZIP3 */ + {"gzip-4", zlib_decompress}, /* ZIO_COMPRESS_GZIP4 */ + {"gzip-5", zlib_decompress}, /* ZIO_COMPRESS_GZIP5 */ + {"gzip-6", zlib_decompress}, /* ZIO_COMPRESS_GZIP6 */ + {"gzip-7", zlib_decompress}, /* ZIO_COMPRESS_GZIP7 */ + {"gzip-8", zlib_decompress}, /* ZIO_COMPRESS_GZIP8 */ + {"gzip-9", zlib_decompress}, /* ZIO_COMPRESS_GZIP9 */ +}; + + + +static int zio_read_data(blkptr_t *bp, grub_zfs_endian_t endian, + void *buf, struct grub_zfs_data *data); + +static int +zio_read(blkptr_t *bp, grub_zfs_endian_t endian, void **buf, + size_t *size, struct grub_zfs_data *data); + +/* + * Our own version of log2(). Same thing as highbit()-1. 
+ */ +static int +zfs_log2(uint64_t num) +{ + int i = 0; + + while (num > 1) { + i++; + num = num >> 1; + } + + return i; +} + + +/* Checksum Functions */ +static void +zio_checksum_off(const void *buf __attribute__ ((unused)), + uint64_t size __attribute__ ((unused)), + grub_zfs_endian_t endian __attribute__ ((unused)), + zio_cksum_t *zcp) +{ + ZIO_SET_CHECKSUM(zcp, 0, 0, 0, 0); +} + +/* Checksum Table and Values */ +static zio_checksum_info_t zio_checksum_table[ZIO_CHECKSUM_FUNCTIONS] = { + {NULL, 0, 0, "inherit"}, + {NULL, 0, 0, "on"}, + {zio_checksum_off, 0, 0, "off"}, + {zio_checksum_SHA256, 1, 1, "label"}, + {zio_checksum_SHA256, 1, 1, "gang_header"}, + {NULL, 0, 0, "zilog"}, + {fletcher_2, 0, 0, "fletcher2"}, + {fletcher_4, 1, 0, "fletcher4"}, + {zio_checksum_SHA256, 1, 0, "SHA256"}, + {NULL, 0, 0, "zilog2"}, +}; + +/* + * zio_checksum_verify: Provides support for checksum verification. + * + * Fletcher2, Fletcher4, and SHA256 are supported. + * + */ +static int +zio_checksum_verify(zio_cksum_t zc, uint32_t checksum, + grub_zfs_endian_t endian, char *buf, int size) +{ + zio_eck_t *zec = (zio_eck_t *) (buf + size) - 1; + zio_checksum_info_t *ci = &zio_checksum_table[checksum]; + zio_cksum_t actual_cksum, expected_cksum; + + if (checksum >= ZIO_CHECKSUM_FUNCTIONS || ci->ci_func == NULL) { + printf("zfs unknown checksum function %d\n", checksum); + return ZFS_ERR_NOT_IMPLEMENTED_YET; + } + + if (ci->ci_eck) { + expected_cksum = zec->zec_cksum; + zec->zec_cksum = zc; + ci->ci_func(buf, size, endian, &actual_cksum); + zec->zec_cksum = expected_cksum; + zc = expected_cksum; + } else { + ci->ci_func(buf, size, endian, &actual_cksum); + } + + if ((actual_cksum.zc_word[0] != zc.zc_word[0]) + || (actual_cksum.zc_word[1] != zc.zc_word[1]) + || (actual_cksum.zc_word[2] != zc.zc_word[2]) + || (actual_cksum.zc_word[3] != zc.zc_word[3])) { + return ZFS_ERR_BAD_FS; + } + + return ZFS_ERR_NONE; +} + +/* + * vdev_uberblock_compare takes two uberblock structures and returns an 
integer + * indicating the more recent of the two. + * Return Value = 1 if ub1 is more recent + * Return Value = -1 if ub2 is more recent (0 if they are equal) + * The most recent uberblock is determined using its transaction number and + * timestamp. The uberblock with the highest transaction number is + * considered "newer". If the transaction numbers of the two blocks match, the + * timestamps are compared to determine the "newer" of the two. + */ +static int +vdev_uberblock_compare(uberblock_t *ub1, uberblock_t *ub2) +{ + grub_zfs_endian_t ub1_endian, ub2_endian; + if (grub_zfs_to_cpu64(ub1->ub_magic, LITTLE_ENDIAN) == UBERBLOCK_MAGIC) + ub1_endian = LITTLE_ENDIAN; + else + ub1_endian = BIG_ENDIAN; + if (grub_zfs_to_cpu64(ub2->ub_magic, LITTLE_ENDIAN) == UBERBLOCK_MAGIC) + ub2_endian = LITTLE_ENDIAN; + else + ub2_endian = BIG_ENDIAN; + + if (grub_zfs_to_cpu64(ub1->ub_txg, ub1_endian) + < grub_zfs_to_cpu64(ub2->ub_txg, ub2_endian)) + return -1; + if (grub_zfs_to_cpu64(ub1->ub_txg, ub1_endian) + > grub_zfs_to_cpu64(ub2->ub_txg, ub2_endian)) + return 1; + + if (grub_zfs_to_cpu64(ub1->ub_timestamp, ub1_endian) + < grub_zfs_to_cpu64(ub2->ub_timestamp, ub2_endian)) + return -1; + if (grub_zfs_to_cpu64(ub1->ub_timestamp, ub1_endian) + > grub_zfs_to_cpu64(ub2->ub_timestamp, ub2_endian)) + return 1; + + return 0; +} + +/* + * Three pieces of information are needed to verify an uberblock: the magic + * number, the version number, and the checksum. 
+ * + * Currently Implemented: version number, magic number, label txg + * Need to Implement: checksum + * + */ +static int +uberblock_verify(uberblock_t *uber, int offset, struct grub_zfs_data *data) +{ + int err; + grub_zfs_endian_t endian = UNKNOWN_ENDIAN; + zio_cksum_t zc; + + if (uber->ub_txg < data->label_txg) { + debug("ignoring partially written label: uber_txg < label_txg %llu %llu\n", + uber->ub_txg, data->label_txg); + return ZFS_ERR_BAD_FS; + } + + if (grub_zfs_to_cpu64(uber->ub_magic, LITTLE_ENDIAN) == UBERBLOCK_MAGIC + && grub_zfs_to_cpu64(uber->ub_version, LITTLE_ENDIAN) > 0 + && grub_zfs_to_cpu64(uber->ub_version, LITTLE_ENDIAN) <= SPA_VERSION) + endian = LITTLE_ENDIAN; + + if (grub_zfs_to_cpu64(uber->ub_magic, BIG_ENDIAN) == UBERBLOCK_MAGIC + && grub_zfs_to_cpu64(uber->ub_version, BIG_ENDIAN) > 0 + && grub_zfs_to_cpu64(uber->ub_version, BIG_ENDIAN) <= SPA_VERSION) + endian = BIG_ENDIAN; + + if (endian == UNKNOWN_ENDIAN) { + printf("invalid uberblock magic\n"); + return ZFS_ERR_BAD_FS; + } + + memset(&zc, 0, sizeof(zc)); + zc.zc_word[0] = grub_cpu_to_zfs64(offset, endian); + err = zio_checksum_verify(zc, ZIO_CHECKSUM_LABEL, endian, + (char *) uber, UBERBLOCK_SIZE(data->vdev_ashift)); + + if (!err) { + /* Check that the data pointed by the rootbp is usable. */ + void *osp = NULL; + size_t ospsize; + err = zio_read(&uber->ub_rootbp, endian, &osp, &ospsize, data); + free(osp); + + if (!err && ospsize < OBJSET_PHYS_SIZE_V14) { + printf("uberblock rootbp points to invalid data\n"); + return ZFS_ERR_BAD_FS; + } + } + + return err; +} + +/* + * Find the best uberblock. + * Return: + * Success - Pointer to the best uberblock. 
+ * Failure - NULL + */ +static uberblock_t *find_bestub(char *ub_array, struct grub_zfs_data *data) +{ + const uint64_t sector = data->vdev_phys_sector; + uberblock_t *ubbest = NULL; + uberblock_t *ubnext; + unsigned int i, offset, pickedub = 0; + int err = ZFS_ERR_NONE; + + const unsigned int UBCOUNT = UBERBLOCK_COUNT(data->vdev_ashift); + const uint64_t UBBYTES = UBERBLOCK_SIZE(data->vdev_ashift); + + for (i = 0; i < UBCOUNT; i++) { + ubnext = (uberblock_t *) (i * UBBYTES + ub_array); + offset = (sector << SPA_MINBLOCKSHIFT) + VDEV_PHYS_SIZE + (i * UBBYTES); + + err = uberblock_verify(ubnext, offset, data); + if (err) + continue; + + if (ubbest == NULL || vdev_uberblock_compare(ubnext, ubbest) > 0) { + ubbest = ubnext; + pickedub = i; + } + } + + if (ubbest) + debug("zfs Found best uberblock at idx %d, txg %llu\n", + pickedub, (unsigned long long) ubbest->ub_txg); + + return ubbest; +} + +static inline size_t +get_psize(blkptr_t *bp, grub_zfs_endian_t endian) +{ + return (((grub_zfs_to_cpu64((bp)->blk_prop, endian) >> 16) & 0xffff) + 1) + << SPA_MINBLOCKSHIFT; +} + +static uint64_t +dva_get_offset(dva_t *dva, grub_zfs_endian_t endian) +{ + return grub_zfs_to_cpu64((dva)->dva_word[1], + endian) << SPA_MINBLOCKSHIFT; +} + +/* + * Read a block of data based on the gang block address dva, + * and put its data in buf. 
+ * + */ +static int +zio_read_gang(blkptr_t *bp, grub_zfs_endian_t endian, dva_t *dva, void *buf, + struct grub_zfs_data *data) +{ + zio_gbh_phys_t *zio_gb; + uint64_t offset, sector; + unsigned i; + int err; + zio_cksum_t zc; + + memset(&zc, 0, sizeof(zc)); + + zio_gb = malloc(SPA_GANGBLOCKSIZE); + if (!zio_gb) + return ZFS_ERR_OUT_OF_MEMORY; + + offset = dva_get_offset(dva, endian); + sector = DVA_OFFSET_TO_PHYS_SECTOR(offset); + + /* read in the gang block header */ + err = zfs_devread(sector, 0, SPA_GANGBLOCKSIZE, (char *) zio_gb); + + if (err) { + free(zio_gb); + return err; + } + + /* XXX */ + /* self-checksumming the gang block header */ + ZIO_SET_CHECKSUM(&zc, DVA_GET_VDEV(dva), + dva_get_offset(dva, endian), bp->blk_birth, 0); + err = zio_checksum_verify(zc, ZIO_CHECKSUM_GANG_HEADER, endian, + (char *) zio_gb, SPA_GANGBLOCKSIZE); + if (err) { + free(zio_gb); + return err; + } + + endian = (grub_zfs_to_cpu64(bp->blk_prop, endian) >> 63) & 1; + + for (i = 0; i < SPA_GBH_NBLKPTRS; i++) { + if (zio_gb->zg_blkptr[i].blk_birth == 0) + continue; + + err = zio_read_data(&zio_gb->zg_blkptr[i], endian, buf, data); + if (err) { + free(zio_gb); + return err; + } + buf = (char *) buf + get_psize(&zio_gb->zg_blkptr[i], endian); + } + free(zio_gb); + return ZFS_ERR_NONE; +} + +/* + * Read in a block of raw data to buf. 
+ */ +static int +zio_read_data(blkptr_t *bp, grub_zfs_endian_t endian, void *buf, + struct grub_zfs_data *data) +{ + int i, psize; + int err = ZFS_ERR_NONE; + + psize = get_psize(bp, endian); + + /* pick a good dva from the block pointer */ + for (i = 0; i < SPA_DVAS_PER_BP; i++) { + uint64_t offset, sector; + + if (bp->blk_dva[i].dva_word[0] == 0 && bp->blk_dva[i].dva_word[1] == 0) + continue; + + if ((grub_zfs_to_cpu64(bp->blk_dva[i].dva_word[1], endian)>>63) & 1) { + err = zio_read_gang(bp, endian, &bp->blk_dva[i], buf, data); + } else { + /* read in a data block */ + offset = dva_get_offset(&bp->blk_dva[i], endian); + sector = DVA_OFFSET_TO_PHYS_SECTOR(offset); + + err = zfs_devread(sector, 0, psize, buf); + } + + if (!err) { + /*Check the underlying checksum before we rule this DVA as "good"*/ + uint32_t checkalgo = (grub_zfs_to_cpu64((bp)->blk_prop, endian) >> 40) & 0xff; + + err = zio_checksum_verify(bp->blk_cksum, checkalgo, endian, buf, psize); + if (!err) + return ZFS_ERR_NONE; + } + + /* If read failed or checksum bad, reset the error. Hopefully we've got some more DVA's to try.*/ + } + + if (!err) { + printf("couldn't find a valid DVA\n"); + err = ZFS_ERR_BAD_FS; + } + + return err; +} + +/* + * Read in a block of data, verify its checksum, decompress if needed, + * and put the uncompressed data in buf. + */ +static int +zio_read(blkptr_t *bp, grub_zfs_endian_t endian, void **buf, + size_t *size, struct grub_zfs_data *data) +{ + size_t lsize, psize; + unsigned int comp; + char *compbuf = NULL; + int err; + + *buf = NULL; + + comp = (grub_zfs_to_cpu64((bp)->blk_prop, endian)>>32) & 0xff; + lsize = (BP_IS_HOLE(bp) ? 
0 : + (((grub_zfs_to_cpu64((bp)->blk_prop, endian) & 0xffff) + 1) + << SPA_MINBLOCKSHIFT)); + psize = get_psize(bp, endian); + + if (size) + *size = lsize; + + if (comp >= ZIO_COMPRESS_FUNCTIONS) { + printf("compression algorithm %u not supported\n", (unsigned int) comp); + return ZFS_ERR_NOT_IMPLEMENTED_YET; + } + + if (comp != ZIO_COMPRESS_OFF && decomp_table[comp].decomp_func == NULL) { + printf("compression algorithm %s not supported\n", decomp_table[comp].name); + return ZFS_ERR_NOT_IMPLEMENTED_YET; + } + + if (comp != ZIO_COMPRESS_OFF) { + compbuf = malloc(psize); + if (!compbuf) + return ZFS_ERR_OUT_OF_MEMORY; + } else { + compbuf = *buf = malloc(lsize); + } + + err = zio_read_data(bp, endian, compbuf, data); + if (err) { + free(compbuf); + *buf = NULL; + return err; + } + + if (comp != ZIO_COMPRESS_OFF) { + *buf = malloc(lsize); + if (!*buf) { + free(compbuf); + return ZFS_ERR_OUT_OF_MEMORY; + } + + err = decomp_table[comp].decomp_func(compbuf, *buf, psize, lsize); + free(compbuf); + if (err) { + free(*buf); + *buf = NULL; + return err; + } + } + + return ZFS_ERR_NONE; +} + +/* + * Get the block from a block id. + * push the block onto the stack. 
+ * + */ +static int +dmu_read(dnode_end_t *dn, uint64_t blkid, void **buf, + grub_zfs_endian_t *endian_out, struct grub_zfs_data *data) +{ + int idx, level; + blkptr_t *bp_array = dn->dn.dn_blkptr; + int epbs = dn->dn.dn_indblkshift - SPA_BLKPTRSHIFT; + blkptr_t *bp; + void *tmpbuf = 0; + grub_zfs_endian_t endian; + int err = ZFS_ERR_NONE; + + bp = malloc(sizeof(blkptr_t)); + if (!bp) + return ZFS_ERR_OUT_OF_MEMORY; + + endian = dn->endian; + for (level = dn->dn.dn_nlevels - 1; level >= 0; level--) { + idx = (blkid >> (epbs * level)) & ((1 << epbs) - 1); + *bp = bp_array[idx]; + if (bp_array != dn->dn.dn_blkptr) { + free(bp_array); + bp_array = 0; + } + + if (BP_IS_HOLE(bp)) { + size_t size = grub_zfs_to_cpu16(dn->dn.dn_datablkszsec, + dn->endian) + << SPA_MINBLOCKSHIFT; + *buf = malloc(size); + if (!*buf) { + err = ZFS_ERR_OUT_OF_MEMORY; + break; + } + memset(*buf, 0, size); + endian = (grub_zfs_to_cpu64(bp->blk_prop, endian) >> 63) & 1; + break; + } + if (level == 0) { + err = zio_read(bp, endian, buf, 0, data); + endian = (grub_zfs_to_cpu64(bp->blk_prop, endian) >> 63) & 1; + break; + } + err = zio_read(bp, endian, &tmpbuf, 0, data); + endian = (grub_zfs_to_cpu64(bp->blk_prop, endian) >> 63) & 1; + if (err) + break; + bp_array = tmpbuf; + } + if (bp_array != dn->dn.dn_blkptr) + free(bp_array); + if (endian_out) + *endian_out = endian; + + free(bp); + return err; +} + +/* + * mzap_lookup: Looks up property described by "name" and returns the value + * in "value". 
+ */ +static int +mzap_lookup(mzap_phys_t *zapobj, grub_zfs_endian_t endian, + int objsize, char *name, uint64_t *value) +{ + int i, chunks; + mzap_ent_phys_t *mzap_ent = zapobj->mz_chunk; + + chunks = objsize / MZAP_ENT_LEN - 1; + for (i = 0; i < chunks; i++) { + if (strcmp(mzap_ent[i].mze_name, name) == 0) { + *value = grub_zfs_to_cpu64(mzap_ent[i].mze_value, endian); + return ZFS_ERR_NONE; + } + } + + printf("couldn't find '%s'\n", name); + return ZFS_ERR_FILE_NOT_FOUND; +} + +static int +mzap_iterate(mzap_phys_t *zapobj, grub_zfs_endian_t endian, int objsize, + int (*hook)(const char *name, + uint64_t val, + struct grub_zfs_data *data), + struct grub_zfs_data *data) +{ + int i, chunks; + mzap_ent_phys_t *mzap_ent = zapobj->mz_chunk; + + chunks = objsize / MZAP_ENT_LEN - 1; + for (i = 0; i < chunks; i++) { + if (hook(mzap_ent[i].mze_name, + grub_zfs_to_cpu64(mzap_ent[i].mze_value, endian), + data)) + return 1; + } + + return 0; +} + +static uint64_t +zap_hash(uint64_t salt, const char *name) +{ + static uint64_t table[256]; + const uint8_t *cp; + uint8_t c; + uint64_t crc = salt; + + if (table[128] == 0) { + uint64_t *ct; + int i, j; + for (i = 0; i < 256; i++) { + for (ct = table + i, *ct = i, j = 8; j > 0; j--) + *ct = (*ct >> 1) ^ (-(*ct & 1) & ZFS_CRC64_POLY); + } + } + + for (cp = (const uint8_t *) name; (c = *cp) != '\0'; cp++) + crc = (crc >> 8) ^ table[(crc ^ c) & 0xFF]; + + /* + * Only use 28 bits, since we need 4 bits in the cookie for the + * collision differentiator. We MUST use the high bits, since + * those are the ones that we first pay attention to when + * choosing the bucket. + */ + crc &= ~((1ULL << (64 - ZAP_HASHBITS)) - 1); + + return crc; +} + +/* + * Only to be used on 8-bit arrays. + * array_len is actual len in bytes (not encoded le_value_length). + * buf is null-terminated. 
+ */ +/* XXX */ +static int +zap_leaf_array_equal(zap_leaf_phys_t *l, grub_zfs_endian_t endian, + int blksft, int chunk, int array_len, const char *buf) +{ + int bseen = 0; + + while (bseen < array_len) { + struct zap_leaf_array *la = &ZAP_LEAF_CHUNK(l, blksft, chunk).l_array; + int toread = MIN(array_len - bseen, ZAP_LEAF_ARRAY_BYTES); + + if (chunk >= ZAP_LEAF_NUMCHUNKS(blksft)) + return 0; + + if (memcmp(la->la_array, buf + bseen, toread) != 0) + break; + chunk = grub_zfs_to_cpu16(la->la_next, endian); + bseen += toread; + } + return (bseen == array_len); +} + +/* XXX */ +static int +zap_leaf_array_get(zap_leaf_phys_t *l, grub_zfs_endian_t endian, int blksft, + int chunk, int array_len, char *buf) +{ + int bseen = 0; + + while (bseen < array_len) { + struct zap_leaf_array *la = &ZAP_LEAF_CHUNK(l, blksft, chunk).l_array; + int toread = MIN(array_len - bseen, ZAP_LEAF_ARRAY_BYTES); + + if (chunk >= ZAP_LEAF_NUMCHUNKS(blksft)) + /* Don't use errno because this error is to be ignored. */ + return ZFS_ERR_BAD_FS; + + memcpy(buf + bseen, la->la_array, toread); + chunk = grub_zfs_to_cpu16(la->la_next, endian); + bseen += toread; + } + return ZFS_ERR_NONE; +} + + +/* + * Given a zap_leaf_phys_t, walk thru the zap leaf chunks to get the + * value for the property "name". 
+ * + */ +/* XXX */ +static int +zap_leaf_lookup(zap_leaf_phys_t *l, grub_zfs_endian_t endian, + int blksft, uint64_t h, + const char *name, uint64_t *value) +{ + uint16_t chunk; + struct zap_leaf_entry *le; + + /* Verify if this is a valid leaf block */ + if (grub_zfs_to_cpu64(l->l_hdr.lh_block_type, endian) != ZBT_LEAF) { + printf("invalid leaf type\n"); + return ZFS_ERR_BAD_FS; + } + if (grub_zfs_to_cpu32(l->l_hdr.lh_magic, endian) != ZAP_LEAF_MAGIC) { + printf("invalid leaf magic\n"); + return ZFS_ERR_BAD_FS; + } + + for (chunk = grub_zfs_to_cpu16(l->l_hash[LEAF_HASH(blksft, h)], endian); + chunk != CHAIN_END; chunk = le->le_next) { + + if (chunk >= ZAP_LEAF_NUMCHUNKS(blksft)) { + printf("invalid chunk number\n"); + return ZFS_ERR_BAD_FS; + } + + le = ZAP_LEAF_ENTRY(l, blksft, chunk); + + /* Verify the chunk entry */ + if (le->le_type != ZAP_CHUNK_ENTRY) { + printf("invalid chunk entry\n"); + return ZFS_ERR_BAD_FS; + } + + if (grub_zfs_to_cpu64(le->le_hash, endian) != h) + continue; + + if (zap_leaf_array_equal(l, endian, blksft, + grub_zfs_to_cpu16(le->le_name_chunk, endian), + grub_zfs_to_cpu16(le->le_name_length, endian), + name)) { + struct zap_leaf_array *la; + + if (le->le_int_size != 8 || le->le_value_length != 1) { + printf("invalid leaf chunk entry\n"); + return ZFS_ERR_BAD_FS; + } + /* get the uint64_t property value */ + la = &ZAP_LEAF_CHUNK(l, blksft, le->le_value_chunk).l_array; + + *value = grub_be_to_cpu64(la->la_array64); + + return ZFS_ERR_NONE; + } + } + + printf("couldn't find '%s'\n", name); + return ZFS_ERR_FILE_NOT_FOUND; +} + + +/* Verify if this is a fat zap header block */ +static int +zap_verify(zap_phys_t *zap) +{ + if (zap->zap_magic != (uint64_t) ZAP_MAGIC) { + printf("bad ZAP magic\n"); + return ZFS_ERR_BAD_FS; + } + + if (zap->zap_flags != 0) { + printf("bad ZAP flags\n"); + return ZFS_ERR_BAD_FS; + } + + if (zap->zap_salt == 0) { + printf("bad ZAP salt\n"); + return ZFS_ERR_BAD_FS; + } + + return ZFS_ERR_NONE; +} + +/* + * Fat 
ZAP lookup + * + */ +/* XXX */ +static int +fzap_lookup(dnode_end_t *zap_dnode, zap_phys_t *zap, + char *name, uint64_t *value, struct grub_zfs_data *data) +{ + void *l; + uint64_t hash, idx, blkid; + int blksft = zfs_log2(grub_zfs_to_cpu16(zap_dnode->dn.dn_datablkszsec, + zap_dnode->endian) << DNODE_SHIFT); + int err; + grub_zfs_endian_t leafendian; + + err = zap_verify(zap); + if (err) + return err; + + hash = zap_hash(zap->zap_salt, name); + + /* get block id from index */ + if (zap->zap_ptrtbl.zt_numblks != 0) { + printf("external pointer tables not supported\n"); + return ZFS_ERR_NOT_IMPLEMENTED_YET; + } + idx = ZAP_HASH_IDX(hash, zap->zap_ptrtbl.zt_shift); + blkid = ((uint64_t *) zap)[idx + (1 << (blksft - 3 - 1))]; + + /* Get the leaf block */ + if ((1U << blksft) < sizeof(zap_leaf_phys_t)) { + printf("ZAP leaf is too small\n"); + return ZFS_ERR_BAD_FS; + } + err = dmu_read(zap_dnode, blkid, &l, &leafendian, data); + if (err) + return err; + + err = zap_leaf_lookup(l, leafendian, blksft, hash, name, value); + free(l); + return err; +} + +/* XXX */ +static int +fzap_iterate(dnode_end_t *zap_dnode, zap_phys_t *zap, + int (*hook)(const char *name, + uint64_t val, + struct grub_zfs_data *data), + struct grub_zfs_data *data) +{ + zap_leaf_phys_t *l; + void *l_in; + uint64_t idx, blkid; + uint16_t chunk; + int blksft = zfs_log2(grub_zfs_to_cpu16(zap_dnode->dn.dn_datablkszsec, + zap_dnode->endian) << DNODE_SHIFT); + int err; + grub_zfs_endian_t endian; + + if (zap_verify(zap)) + return 0; + + /* get block id from index */ + if (zap->zap_ptrtbl.zt_numblks != 0) { + printf("external pointer tables not supported\n"); + return 0; + } + /* Get the leaf block */ + if ((1U << blksft) < sizeof(zap_leaf_phys_t)) { + printf("ZAP leaf is too small\n"); + return 0; + } + for (idx = 0; idx < zap->zap_ptrtbl.zt_numblks; idx++) { + blkid = ((uint64_t *) zap)[idx + (1 << (blksft - 3 - 1))]; + + err = dmu_read(zap_dnode, blkid, &l_in, &endian, data); + l = l_in; + if (err) + 
continue; + + /* Verify if this is a valid leaf block */ + if (grub_zfs_to_cpu64(l->l_hdr.lh_block_type, endian) != ZBT_LEAF) { + free(l); + continue; + } + if (grub_zfs_to_cpu32(l->l_hdr.lh_magic, endian) != ZAP_LEAF_MAGIC) { + free(l); + continue; + } + + for (chunk = 0; chunk < ZAP_LEAF_NUMCHUNKS(blksft); chunk++) { + char *buf; + struct zap_leaf_array *la; + struct zap_leaf_entry *le; + uint64_t val; + le = ZAP_LEAF_ENTRY(l, blksft, chunk); + + /* Verify the chunk entry */ + if (le->le_type != ZAP_CHUNK_ENTRY) + continue; + + buf = malloc(grub_zfs_to_cpu16(le->le_name_length, endian) + + 1); + if (!buf) + continue; + if (zap_leaf_array_get(l, endian, blksft, le->le_name_chunk, + le->le_name_length, buf)) { + free(buf); + continue; + } + /* terminate at the same (converted) length that was allocated */ + buf[grub_zfs_to_cpu16(le->le_name_length, endian)] = 0; + + if (le->le_int_size != 8 + || grub_zfs_to_cpu16(le->le_value_length, endian) != 1) { + free(buf); + continue; + } + + /* get the uint64_t property value */ + la = &ZAP_LEAF_CHUNK(l, blksft, le->le_value_chunk).l_array; + val = grub_be_to_cpu64(la->la_array64); + if (hook(buf, val, data)) { + free(buf); + free(l); + return 1; + } + free(buf); + } + free(l); + } + return 0; +} + + +/* + * Read in the data of a zap object and find the value for a matching + * property name. + * + */ +static int +zap_lookup(dnode_end_t *zap_dnode, char *name, uint64_t *val, + struct grub_zfs_data *data) +{ + uint64_t block_type; + int size; + void *zapbuf; + int err; + grub_zfs_endian_t endian; + + /* Read in the first block of the zap object data. 
*/ + size = grub_zfs_to_cpu16(zap_dnode->dn.dn_datablkszsec, + zap_dnode->endian) << SPA_MINBLOCKSHIFT; + err = dmu_read(zap_dnode, 0, &zapbuf, &endian, data); + if (err) + return err; + block_type = grub_zfs_to_cpu64(*((uint64_t *) zapbuf), endian); + + if (block_type == ZBT_MICRO) { + err = (mzap_lookup(zapbuf, endian, size, name, val)); + free(zapbuf); + return err; + } else if (block_type == ZBT_HEADER) { + /* this is a fat zap */ + err = (fzap_lookup(zap_dnode, zapbuf, name, val, data)); + free(zapbuf); + return err; + } + + printf("unknown ZAP type\n"); + return ZFS_ERR_BAD_FS; +} + +static int +zap_iterate(dnode_end_t *zap_dnode, + int (*hook)(const char *name, uint64_t val, + struct grub_zfs_data *data), + struct grub_zfs_data *data) +{ + uint64_t block_type; + int size; + void *zapbuf; + int err; + int ret; + grub_zfs_endian_t endian; + + /* Read in the first block of the zap object data. */ + size = grub_zfs_to_cpu16(zap_dnode->dn.dn_datablkszsec, zap_dnode->endian) << SPA_MINBLOCKSHIFT; + err = dmu_read(zap_dnode, 0, &zapbuf, &endian, data); + if (err) + return 0; + block_type = grub_zfs_to_cpu64(*((uint64_t *) zapbuf), endian); + + if (block_type == ZBT_MICRO) { + ret = mzap_iterate(zapbuf, endian, size, hook, data); + free(zapbuf); + return ret; + } else if (block_type == ZBT_HEADER) { + /* this is a fat zap */ + ret = fzap_iterate(zap_dnode, zapbuf, hook, data); + free(zapbuf); + return ret; + } + printf("unknown ZAP type\n"); + return 0; +} + + +/* + * Get the dnode of an object number from the metadnode of an object set. 
+ * + * Input + * mdn - metadnode to get the object dnode + * objnum - object number for the object dnode + * buf - data buffer that holds the returning dnode + */ +static int +dnode_get(dnode_end_t *mdn, uint64_t objnum, uint8_t type, + dnode_end_t *buf, struct grub_zfs_data *data) +{ + uint64_t blkid, blksz; /* the block id this object dnode is in */ + int epbs; /* shift of number of dnodes in a block */ + int idx; /* index within a block */ + void *dnbuf; + int err; + grub_zfs_endian_t endian; + + blksz = grub_zfs_to_cpu16(mdn->dn.dn_datablkszsec, + mdn->endian) << SPA_MINBLOCKSHIFT; + + epbs = zfs_log2(blksz) - DNODE_SHIFT; + blkid = objnum >> epbs; + idx = objnum & ((1 << epbs) - 1); + + if (data->dnode_buf != NULL && memcmp(data->dnode_mdn, mdn, + sizeof(*mdn)) == 0 + && objnum >= data->dnode_start && objnum < data->dnode_end) { + memmove(&(buf->dn), &(data->dnode_buf)[idx], DNODE_SIZE); + buf->endian = data->dnode_endian; + if (type && buf->dn.dn_type != type) { + printf("incorrect dnode type: %02X != %02x\n", buf->dn.dn_type, type); + return ZFS_ERR_BAD_FS; + } + return ZFS_ERR_NONE; + } + + err = dmu_read(mdn, blkid, &dnbuf, &endian, data); + if (err) + return err; + + free(data->dnode_buf); + free(data->dnode_mdn); + data->dnode_mdn = malloc(sizeof(*mdn)); + if (!data->dnode_mdn) { + data->dnode_buf = 0; + } else { + memcpy(data->dnode_mdn, mdn, sizeof(*mdn)); + data->dnode_buf = dnbuf; + data->dnode_start = blkid << epbs; + data->dnode_end = (blkid + 1) << epbs; + data->dnode_endian = endian; + } + + memmove(&(buf->dn), (dnode_phys_t *) dnbuf + idx, DNODE_SIZE); + buf->endian = endian; + if (type && buf->dn.dn_type != type) { + printf("incorrect dnode type\n"); + return ZFS_ERR_BAD_FS; + } + + return ZFS_ERR_NONE; +} + +/* + * Get the file dnode for a given file name where mdn is the meta dnode + * for this ZFS object set. When found, place the file dnode in dn. + * The 'path' argument will be mangled. 
+ * + */ +static int +dnode_get_path(dnode_end_t *mdn, const char *path_in, dnode_end_t *dn, + struct grub_zfs_data *data) +{ + uint64_t objnum, version; + char *cname, ch; + int err = ZFS_ERR_NONE; + char *path, *path_buf; + struct dnode_chain { + struct dnode_chain *next; + dnode_end_t dn; + }; + struct dnode_chain *dnode_path = 0, *dn_new, *root; + + dn_new = malloc(sizeof(*dn_new)); + if (!dn_new) + return ZFS_ERR_OUT_OF_MEMORY; + dn_new->next = 0; + dnode_path = root = dn_new; + + err = dnode_get(mdn, MASTER_NODE_OBJ, DMU_OT_MASTER_NODE, + &(dnode_path->dn), data); + if (err) { + free(dn_new); + return err; + } + + err = zap_lookup(&(dnode_path->dn), ZPL_VERSION_STR, &version, data); + if (err) { + free(dn_new); + return err; + } + if (version > ZPL_VERSION) { + free(dn_new); + printf("too new ZPL version\n"); + return ZFS_ERR_NOT_IMPLEMENTED_YET; + } + + err = zap_lookup(&(dnode_path->dn), ZFS_ROOT_OBJ, &objnum, data); + if (err) { + free(dn_new); + return err; + } + + err = dnode_get(mdn, objnum, 0, &(dnode_path->dn), data); + if (err) { + free(dn_new); + return err; + } + + path = path_buf = strdup(path_in); + if (!path_buf) { + free(dn_new); + return ZFS_ERR_OUT_OF_MEMORY; + } + + while (1) { + /* skip leading slashes */ + while (*path == '/') + path++; + if (!*path) + break; + /* get the next component name */ + cname = path; + while (*path && *path != '/') + path++; + /* Skip dot. */ + if (cname + 1 == path && cname[0] == '.') + continue; + /* Handle double dot. */ + if (cname + 2 == path && cname[0] == '.' 
&& cname[1] == '.') { + if (dn_new->next) { + dn_new = dnode_path; + dnode_path = dn_new->next; + free(dn_new); + } else { + printf("can't resolve ..\n"); + err = ZFS_ERR_FILE_NOT_FOUND; + break; + } + continue; + } + + ch = *path; + *path = 0; /* ensure null termination */ + + if (dnode_path->dn.dn.dn_type != DMU_OT_DIRECTORY_CONTENTS) { + free(path_buf); + printf("not a directory\n"); + return ZFS_ERR_BAD_FILE_TYPE; + } + err = zap_lookup(&(dnode_path->dn), cname, &objnum, data); + if (err) + break; + + dn_new = malloc(sizeof(*dn_new)); + if (!dn_new) { + err = ZFS_ERR_OUT_OF_MEMORY; + break; + } + dn_new->next = dnode_path; + dnode_path = dn_new; + + objnum = ZFS_DIRENT_OBJ(objnum); + err = dnode_get(mdn, objnum, 0, &(dnode_path->dn), data); + if (err) + break; + + *path = ch; + } + + if (!err) + memcpy(dn, &(dnode_path->dn), sizeof(*dn)); + + while (dnode_path) { + dn_new = dnode_path->next; + free(dnode_path); + dnode_path = dn_new; + } + free(path_buf); + return err; +} + + +/* + * Given a MOS metadnode, get the metadnode of a given filesystem name (fsname), + * e.g. pool/rootfs, or a given object number (obj), e.g. the object number + * of pool/rootfs. + * + * If no fsname and no obj are given, return the DSL_DIR metadnode. + * If fsname is given, return its metadnode and its matching object number. + * If only obj is given, return the metadnode for this object number. 
+ * + */ +static int +get_filesystem_dnode(dnode_end_t *mosmdn, char *fsname, + dnode_end_t *mdn, struct grub_zfs_data *data) +{ + uint64_t objnum; + int err; + + err = dnode_get(mosmdn, DMU_POOL_DIRECTORY_OBJECT, + DMU_OT_OBJECT_DIRECTORY, mdn, data); + if (err) + return err; + + err = zap_lookup(mdn, DMU_POOL_ROOT_DATASET, &objnum, data); + if (err) + return err; + + err = dnode_get(mosmdn, objnum, DMU_OT_DSL_DIR, mdn, data); + if (err) + return err; + + while (*fsname) { + uint64_t childobj; + char *cname, ch; + + while (*fsname == '/') + fsname++; + + if (!*fsname || *fsname == '@') + break; + + cname = fsname; + while (*fsname && !isspace(*fsname) && *fsname != '/') + fsname++; + ch = *fsname; + *fsname = 0; + + childobj = grub_zfs_to_cpu64((((dsl_dir_phys_t *) DN_BONUS(&mdn->dn)))->dd_child_dir_zapobj, mdn->endian); + err = dnode_get(mosmdn, childobj, + DMU_OT_DSL_DIR_CHILD_MAP, mdn, data); + if (err) + return err; + + err = zap_lookup(mdn, cname, &objnum, data); + if (err) + return err; + + err = dnode_get(mosmdn, objnum, DMU_OT_DSL_DIR, mdn, data); + if (err) + return err; + + *fsname = ch; + } + return ZFS_ERR_NONE; +} + +static int +make_mdn(dnode_end_t *mdn, struct grub_zfs_data *data) +{ + void *osp; + blkptr_t *bp; + size_t ospsize; + int err; + + bp = &(((dsl_dataset_phys_t *) DN_BONUS(&mdn->dn))->ds_bp); + err = zio_read(bp, mdn->endian, &osp, &ospsize, data); + if (err) + return err; + if (ospsize < OBJSET_PHYS_SIZE_V14) { + free(osp); + printf("too small osp\n"); + return ZFS_ERR_BAD_FS; + } + + mdn->endian = (grub_zfs_to_cpu64(bp->blk_prop, mdn->endian)>>63) & 1; + memmove((char *) &(mdn->dn), + (char *) &((objset_phys_t *) osp)->os_meta_dnode, DNODE_SIZE); + free(osp); + return ZFS_ERR_NONE; +} + +static int +dnode_get_fullpath(const char *fullpath, dnode_end_t *mdn, + uint64_t *mdnobj, dnode_end_t *dn, int *isfs, + struct grub_zfs_data *data) +{ + char *fsname, *snapname; + const char *ptr_at, *filename; + uint64_t headobj; + int err; + + ptr_at 
= strchr(fullpath, '@'); + if (!ptr_at) { + *isfs = 1; + filename = 0; + snapname = 0; + fsname = strdup(fullpath); + } else { + const char *ptr_slash = strchr(ptr_at, '/'); + + *isfs = 0; + fsname = malloc(ptr_at - fullpath + 1); + if (!fsname) + return ZFS_ERR_OUT_OF_MEMORY; + memcpy(fsname, fullpath, ptr_at - fullpath); + fsname[ptr_at - fullpath] = 0; + if (ptr_at[1] && ptr_at[1] != '/') { + snapname = malloc(ptr_slash - ptr_at); + if (!snapname) { + free(fsname); + return ZFS_ERR_OUT_OF_MEMORY; + } + memcpy(snapname, ptr_at + 1, ptr_slash - ptr_at - 1); + snapname[ptr_slash - ptr_at - 1] = 0; + } else { + snapname = 0; + } + if (ptr_slash) + filename = ptr_slash; + else + filename = "/"; + printf("zfs fsname = '%s' snapname='%s' filename = '%s'\n", + fsname, snapname, filename); + } + + + err = get_filesystem_dnode(&(data->mos), fsname, dn, data); + + if (err) { + free(fsname); + free(snapname); + return err; + } + + headobj = grub_zfs_to_cpu64(((dsl_dir_phys_t *) DN_BONUS(&dn->dn))->dd_head_dataset_obj, dn->endian); + + err = dnode_get(&(data->mos), headobj, DMU_OT_DSL_DATASET, mdn, data); + if (err) { + free(fsname); + free(snapname); + return err; + } + + if (snapname) { + uint64_t snapobj; + + snapobj = grub_zfs_to_cpu64(((dsl_dataset_phys_t *) DN_BONUS(&mdn->dn))->ds_snapnames_zapobj, mdn->endian); + + err = dnode_get(&(data->mos), snapobj, + DMU_OT_DSL_DS_SNAP_MAP, mdn, data); + if (!err) + err = zap_lookup(mdn, snapname, &headobj, data); + if (!err) + err = dnode_get(&(data->mos), headobj, DMU_OT_DSL_DATASET, mdn, data); + if (err) { + free(fsname); + free(snapname); + return err; + } + } + + if (mdnobj) + *mdnobj = headobj; + + make_mdn(mdn, data); + + if (*isfs) { + free(fsname); + free(snapname); + return ZFS_ERR_NONE; + } + err = dnode_get_path(mdn, filename, dn, data); + free(fsname); + free(snapname); + return err; +} + +/* + * For a given XDR packed nvlist, verify the first 4 bytes and move on. 
+ * + * An XDR packed nvlist is encoded as (comments from nvs_xdr_create) : + * + * encoding method/host endian (4 bytes) + * nvl_version (4 bytes) + * nvl_nvflag (4 bytes) + * encoded nvpairs: + * encoded size of the nvpair (4 bytes) + * decoded size of the nvpair (4 bytes) + * name string size (4 bytes) + * name string data (sizeof(NV_ALIGN4(string)) + * data type (4 bytes) + * # of elements in the nvpair (4 bytes) + * data + * 2 zero's for the last nvpair + * (end of the entire list) (8 bytes) + * + */ + +static int +nvlist_find_value(char *nvlist, char *name, int valtype, char **val, + size_t *size_out, size_t *nelm_out) +{ + int name_len, type, encode_size; + char *nvpair, *nvp_name; + + /* Verify if the 1st and 2nd byte in the nvlist are valid. */ + /* NOTE: independently of what endianness header announces all + subsequent values are big-endian. */ + if (nvlist[0] != NV_ENCODE_XDR || (nvlist[1] != NV_LITTLE_ENDIAN + && nvlist[1] != NV_BIG_ENDIAN)) { + printf("zfs incorrect nvlist header\n"); + return ZFS_ERR_BAD_FS; + } + + /* skip the header, nvl_version, and nvl_nvflag */ + nvlist = nvlist + 4 * 3; + /* + * Loop thru the nvpair list + * The XDR representation of an integer is in big-endian byte order. 
+ */ + while ((encode_size = grub_be_to_cpu32(*(uint32_t *) nvlist))) { + int nelm; + + nvpair = nvlist + 4 * 2; /* skip the encode/decode size */ + + name_len = grub_be_to_cpu32(*(uint32_t *) nvpair); + nvpair += 4; + + nvp_name = nvpair; + nvpair = nvpair + ((name_len + 3) & ~3); /* align */ + + type = grub_be_to_cpu32(*(uint32_t *) nvpair); + nvpair += 4; + + nelm = grub_be_to_cpu32(*(uint32_t *) nvpair); + if (nelm < 1) { + printf("empty nvpair\n"); + return ZFS_ERR_BAD_FS; + } + + nvpair += 4; + + if ((strncmp(nvp_name, name, name_len) == 0) && type == valtype) { + *val = nvpair; + *size_out = encode_size; + if (nelm_out) + *nelm_out = nelm; + return 1; + } + + nvlist += encode_size; /* goto the next nvpair */ + } + return 0; +} + +int +grub_zfs_nvlist_lookup_uint64(char *nvlist, char *name, uint64_t *out) +{ + char *nvpair; + size_t size; + int found; + + found = nvlist_find_value(nvlist, name, DATA_TYPE_UINT64, &nvpair, &size, 0); + if (!found) + return 0; + if (size < sizeof(uint64_t)) { + printf("invalid uint64\n"); + return ZFS_ERR_BAD_FS; + } + + *out = grub_be_to_cpu64(*(uint64_t *) nvpair); + return 1; +} + +char * +grub_zfs_nvlist_lookup_string(char *nvlist, char *name) +{ + char *nvpair; + char *ret; + size_t slen; + size_t size; + int found; + + found = nvlist_find_value(nvlist, name, DATA_TYPE_STRING, &nvpair, &size, 0); + if (!found) + return 0; + if (size < 4) { + printf("invalid string\n"); + return 0; + } + slen = grub_be_to_cpu32(*(uint32_t *) nvpair); + if (slen > size - 4) + slen = size - 4; + ret = malloc(slen + 1); + if (!ret) + return 0; + memcpy(ret, nvpair + 4, slen); + ret[slen] = 0; + return ret; +} + +char * +grub_zfs_nvlist_lookup_nvlist(char *nvlist, char *name) +{ + char *nvpair; + char *ret; + size_t size; + int found; + + found = nvlist_find_value(nvlist, name, DATA_TYPE_NVLIST, &nvpair, + &size, 0); + if (!found) + return 0; + ret = calloc(1, size + 3 * sizeof(uint32_t)); + if (!ret) + return 0; + memcpy(ret, nvlist, 
sizeof(uint32_t)); + + memcpy(ret + sizeof(uint32_t), nvpair, size); + return ret; +} + +int +grub_zfs_nvlist_lookup_nvlist_array_get_nelm(char *nvlist, char *name) +{ + char *nvpair; + size_t nelm, size; + int found; + + found = nvlist_find_value(nvlist, name, DATA_TYPE_NVLIST, &nvpair, + &size, &nelm); + if (!found) + return -1; + return nelm; +} + +char * +grub_zfs_nvlist_lookup_nvlist_array(char *nvlist, char *name, + size_t index) +{ + char *nvpair, *nvpairptr; + int found; + char *ret; + size_t size; + unsigned i; + size_t nelm; + + found = nvlist_find_value(nvlist, name, DATA_TYPE_NVLIST, &nvpair, + &size, &nelm); + if (!found) + return 0; + if (index >= nelm) { + printf("trying to lookup past nvlist array\n"); + return 0; + } + + nvpairptr = nvpair; + + for (i = 0; i < index; i++) { + uint32_t encode_size; + + /* skip the header, nvl_version, and nvl_nvflag */ + nvpairptr = nvpairptr + 4 * 2; + + /* advance the cursor that the loop condition reads from */ + while (nvpairptr < nvpair + size + && (encode_size = grub_be_to_cpu32(*(uint32_t *) nvpairptr))) + nvpairptr += encode_size; /* go to the next nvpair */ + + nvpairptr = nvpairptr + 4 * 2; /* skip the ending 2 zeros - 8 bytes */ + } + + if (nvpairptr >= nvpair + size + || nvpairptr + grub_be_to_cpu32(*(uint32_t *) (nvpairptr + 4 * 2)) + >= nvpair + size) { + printf("incorrect nvlist array\n"); + return 0; + } + + ret = calloc(1, grub_be_to_cpu32(*(uint32_t *) (nvpairptr + 4 * 2)) + + 3 * sizeof(uint32_t)); + if (!ret) + return 0; + memcpy(ret, nvlist, sizeof(uint32_t)); + + memcpy(ret + sizeof(uint32_t), nvpairptr, size); + return ret; +} + +static int +zfs_fetch_nvlist(struct grub_zfs_data *data, char **nvlist) +{ + int err; + + *nvlist = malloc(VDEV_PHYS_SIZE); + if (!*nvlist) + return ZFS_ERR_OUT_OF_MEMORY; + /* Read in the vdev name-value pair list (112K). */ + err = zfs_devread(data->vdev_phys_sector, 0, VDEV_PHYS_SIZE, *nvlist); + if (err) { + free(*nvlist); + *nvlist = 0; + return err; + } + return ZFS_ERR_NONE; +} + +/* + * Check the disk label information and retrieve needed vdev name-value pairs. 
+ * + */ +static int +check_pool_label(struct grub_zfs_data *data) +{ + uint64_t pool_state; + char *nvlist; /* for the pool */ + char *vdevnvlist; /* for the vdev */ + uint64_t diskguid; + uint64_t version; + int found; + int err; + + err = zfs_fetch_nvlist(data, &nvlist); + if (err) + return err; + + found = grub_zfs_nvlist_lookup_uint64(nvlist, ZPOOL_CONFIG_POOL_STATE, + &pool_state); + if (!found) { + free(nvlist); + printf("zfs pool state not found\n"); + return ZFS_ERR_BAD_FS; + } + + if (pool_state == POOL_STATE_DESTROYED) { + free(nvlist); + printf("zpool is marked as destroyed\n"); + return ZFS_ERR_BAD_FS; + } + + data->label_txg = 0; + found = grub_zfs_nvlist_lookup_uint64(nvlist, ZPOOL_CONFIG_POOL_TXG, + &data->label_txg); + if (!found) { + free(nvlist); + printf("zfs pool txg not found\n"); + return ZFS_ERR_BAD_FS; + } + + /* not an active device */ + if (data->label_txg == 0) { + free(nvlist); + printf("zpool is not active\n"); + return ZFS_ERR_BAD_FS; + } + + found = grub_zfs_nvlist_lookup_uint64(nvlist, ZPOOL_CONFIG_VERSION, + &version); + if (!found) { + free(nvlist); + printf("zpool config version not found\n"); + return ZFS_ERR_BAD_FS; + } + + if (version > SPA_VERSION) { + free(nvlist); + printf("SPA version too new %llu > %llu\n", + (unsigned long long) version, + (unsigned long long) SPA_VERSION); + return ZFS_ERR_NOT_IMPLEMENTED_YET; + } + + vdevnvlist = grub_zfs_nvlist_lookup_nvlist(nvlist, ZPOOL_CONFIG_VDEV_TREE); + if (!vdevnvlist) { + free(nvlist); + printf("ZFS config vdev tree not found\n"); + return ZFS_ERR_BAD_FS; + } + + found = grub_zfs_nvlist_lookup_uint64(vdevnvlist, ZPOOL_CONFIG_ASHIFT, + &data->vdev_ashift); + free(vdevnvlist); + if (!found) { + free(nvlist); + printf("ZPOOL config ashift not found\n"); + return ZFS_ERR_BAD_FS; + } + + found = grub_zfs_nvlist_lookup_uint64(nvlist, ZPOOL_CONFIG_GUID, &diskguid); + if (!found) { + free(nvlist); + printf("ZPOOL config guid not found\n"); + return ZFS_ERR_BAD_FS; + } + + found = 
grub_zfs_nvlist_lookup_uint64(nvlist, ZPOOL_CONFIG_POOL_GUID, &data->pool_guid); + if (!found) { + free(nvlist); + printf("ZPOOL config pool guid not found\n"); + return ZFS_ERR_BAD_FS; + } + + free(nvlist); + + printf("ZFS Pool GUID: %llu (%016llx) Label: GUID: %llu (%016llx), txg: %llu, SPA v%llu, ashift: %llu\n", + (unsigned long long) data->pool_guid, + (unsigned long long) data->pool_guid, + (unsigned long long) diskguid, + (unsigned long long) diskguid, + (unsigned long long) data->label_txg, + (unsigned long long) version, + (unsigned long long) data->vdev_ashift); + + return ZFS_ERR_NONE; +} + +/* + * vdev_label_start returns the physical disk offset (in bytes) of + * label "l". + */ +static uint64_t vdev_label_start(uint64_t psize, int l) +{ + return (l * sizeof(vdev_label_t) + (l < VDEV_LABELS / 2 ? + 0 : psize - + VDEV_LABELS * sizeof(vdev_label_t))); +} + +void +zfs_unmount(struct grub_zfs_data *data) +{ + free(data->dnode_buf); + free(data->dnode_mdn); + free(data->file_buf); + free(data); +} + +/* + * zfs_mount() locates a valid uberblock of the root pool and reads in its MOS + * to the memory address MOS. + * + */ +struct grub_zfs_data * +zfs_mount(device_t dev) +{ + struct grub_zfs_data *data = 0; + int label = 0, bestlabel = -1; + char *ub_array; + uberblock_t *ubbest; + uberblock_t *ubcur = NULL; + void *osp = 0; + size_t ospsize; + int err; + + data = malloc(sizeof(*data)); + if (!data) + return 0; + memset(data, 0, sizeof(*data)); + + ub_array = malloc(VDEV_UBERBLOCK_RING); + if (!ub_array) { + zfs_unmount(data); + return 0; + } + + ubbest = malloc(sizeof(*ubbest)); + if (!ubbest) { + free(ub_array); + zfs_unmount(data); + return 0; + } + memset(ubbest, 0, sizeof(*ubbest)); + + /* + * some eltorito stacks don't give us a size and + * we end up setting the size to MAXUINT; furthermore, + * some of these devices stop working once a single + * read past the end has been issued. 
Checking + * for a maximum part_length and skipping the backup + * labels at the end of the slice/partition/device + * avoids breaking down on such devices. + */ + const int vdevnum = + dev->part_length == 0 ? + VDEV_LABELS / 2 : VDEV_LABELS; + + /* Size in bytes of the device (disk or partition) aligned to label size*/ + uint64_t device_size = + dev->part_length << SECTOR_BITS; + + const uint64_t alignedbytes = + P2ALIGN(device_size, (uint64_t) sizeof(vdev_label_t)); + + for (label = 0; label < vdevnum; label++) { + uint64_t labelstartbytes = vdev_label_start(alignedbytes, label); + uint64_t labelstart = labelstartbytes >> SECTOR_BITS; + + debug("zfs reading label %d at sector %llu (byte %llu)\n", + label, (unsigned long long) labelstart, + (unsigned long long) labelstartbytes); + + data->vdev_phys_sector = labelstart + + ((VDEV_SKIP_SIZE + VDEV_BOOT_HEADER_SIZE) >> SECTOR_BITS); + + err = check_pool_label(data); + if (err) { + printf("zfs error checking label %d\n", label); + continue; + } + + /* Read in the uberblock ring (128K). */ + err = zfs_devread(data->vdev_phys_sector + + (VDEV_PHYS_SIZE >> SECTOR_BITS), + 0, VDEV_UBERBLOCK_RING, ub_array); + if (err) { + printf("zfs error reading uberblock ring for label %d\n", label); + continue; + } + + ubcur = find_bestub(ub_array, data); + if (!ubcur) { + printf("zfs No good uberblocks found in label %d\n", label); + continue; + } + + if (vdev_uberblock_compare(ubcur, ubbest) > 0) { + /* Looks like the block is good, so use it.*/ + memcpy(ubbest, ubcur, sizeof(*ubbest)); + bestlabel = label; + debug("zfs Current best uberblock found in label %d\n", label); + } + } + free(ub_array); + + /* We zero'd the structure to begin with. If we never assigned to it, + magic will still be zero. 
*/ + if (!ubbest->ub_magic) { + printf("couldn't find a valid ZFS label\n"); + zfs_unmount(data); + free(ubbest); + return 0; + } + + debug("zfs ubbest %p in label %d\n", ubbest, bestlabel); + + grub_zfs_endian_t ub_endian = + grub_zfs_to_cpu64(ubbest->ub_magic, LITTLE_ENDIAN) == UBERBLOCK_MAGIC + ? LITTLE_ENDIAN : BIG_ENDIAN; + + debug("zfs endian set to %s\n", !ub_endian ? "big" : "little"); + + err = zio_read(&ubbest->ub_rootbp, ub_endian, &osp, &ospsize, data); + + if (err) { + printf("couldn't zio_read object directory\n"); + zfs_unmount(data); + free(ubbest); + return 0; + } + + if (ospsize < OBJSET_PHYS_SIZE_V14) { + printf("osp too small\n"); + zfs_unmount(data); + free(osp); + free(ubbest); + return 0; + } + + /* Got the MOS. Save it at the memory addr MOS. */ + memmove(&(data->mos.dn), &((objset_phys_t *) osp)->os_meta_dnode, DNODE_SIZE); + data->mos.endian = + (grub_zfs_to_cpu64(ubbest->ub_rootbp.blk_prop, ub_endian) >> 63) & 1; + memmove(&(data->current_uberblock), ubbest, sizeof(uberblock_t)); + + free(osp); + free(ubbest); + + return data; +} + +int +grub_zfs_fetch_nvlist(device_t dev, char **nvlist) +{ + struct grub_zfs_data *zfs; + int err; + + zfs = zfs_mount(dev); + if (!zfs) + return ZFS_ERR_BAD_FS; + err = zfs_fetch_nvlist(zfs, nvlist); + zfs_unmount(zfs); + return err; +} + +static int +zfs_label(device_t device, char **label) +{ + char *nvlist; + int err; + struct grub_zfs_data *data; + + data = zfs_mount(device); + if (!data) + return ZFS_ERR_BAD_FS; + + err = zfs_fetch_nvlist(data, &nvlist); + if (err) { + zfs_unmount(data); + return err; + } + + *label = grub_zfs_nvlist_lookup_string(nvlist, ZPOOL_CONFIG_POOL_NAME); + free(nvlist); + zfs_unmount(data); + return ZFS_ERR_NONE; +} + +static int +zfs_uuid(device_t device, char **uuid) +{ + struct grub_zfs_data *data; + + data = zfs_mount(device); + if (!data) + return ZFS_ERR_BAD_FS; + + *uuid = malloc(17); /* %016llx + nil */ + if (!*uuid) + return ZFS_ERR_OUT_OF_MEMORY; + + /* *uuid = 
xasprintf ("%016llx", (long long unsigned) data->pool_guid);*/ + snprintf(*uuid, 17, "%016llx", (long long unsigned) data->pool_guid); + zfs_unmount(data); + + return ZFS_ERR_NONE; +} + +/* + * zfs_open() locates a file in the root pool by following the + * MOS and places the dnode of the file in the memory address DNODE. + */ +int +zfs_open(struct zfs_file *file, const char *fsfilename) +{ + struct grub_zfs_data *data; + int err; + int isfs; + + data = zfs_mount(file->device); + if (!data) + return ZFS_ERR_BAD_FS; + + err = dnode_get_fullpath(fsfilename, &(data->mdn), 0, + &(data->dnode), &isfs, data); + if (err) { + zfs_unmount(data); + return err; + } + + if (isfs) { + zfs_unmount(data); + printf("Missing @ or / separator\n"); + return ZFS_ERR_FILE_NOT_FOUND; + } + + /* We found the dnode for this file. Verify if it is a plain file. */ + if (data->dnode.dn.dn_type != DMU_OT_PLAIN_FILE_CONTENTS) { + zfs_unmount(data); + printf("not a file\n"); + return ZFS_ERR_BAD_FILE_TYPE; + } + + /* get the file size and set the file position to 0 */ + + /* + * For DMU_OT_SA we will need to locate the SIZE + * attribute, which could be either in the bonus buffer + * or the "spill" block. 
+ */ + if (data->dnode.dn.dn_bonustype == DMU_OT_SA) { + void *sahdrp; + int hdrsize; + + if (data->dnode.dn.dn_bonuslen != 0) { + sahdrp = (sa_hdr_phys_t *) DN_BONUS(&data->dnode.dn); + } else if (data->dnode.dn.dn_flags & DNODE_FLAG_SPILL_BLKPTR) { + blkptr_t *bp = &data->dnode.dn.dn_spill; + + err = zio_read(bp, data->dnode.endian, &sahdrp, NULL, data); + if (err) + return err; + } else { + printf("filesystem is corrupt :(\n"); + return ZFS_ERR_BAD_FS; + } + + hdrsize = SA_HDR_SIZE(((sa_hdr_phys_t *) sahdrp)); + file->size = *(uint64_t *) ((char *) sahdrp + hdrsize + SA_SIZE_OFFSET); + } else { + file->size = grub_zfs_to_cpu64(((znode_phys_t *) DN_BONUS(&data->dnode.dn))->zp_size, data->dnode.endian); + } + + file->data = data; + file->offset = 0; + + return ZFS_ERR_NONE; +} + +uint64_t +zfs_read(zfs_file_t file, char *buf, uint64_t len) +{ + struct grub_zfs_data *data = (struct grub_zfs_data *) file->data; + int blksz, movesize; + uint64_t length; + int64_t red; + int err; + + if (data->file_buf == NULL) { + data->file_buf = malloc(SPA_MAXBLOCKSIZE); + if (!data->file_buf) + return -1; + data->file_start = data->file_end = 0; + } + + /* + * If offset is in memory, move it into the buffer provided and return. + */ + if (file->offset >= data->file_start + && file->offset + len <= data->file_end) { + memmove(buf, data->file_buf + file->offset - data->file_start, + len); + return len; + } + + blksz = grub_zfs_to_cpu16(data->dnode.dn.dn_datablkszsec, + data->dnode.endian) << SPA_MINBLOCKSHIFT; + + /* + * Entire Dnode is too big to fit into the space available. We + * will need to read it in chunks. This could be optimized to + * read in as large a chunk as there is space available, but for + * now, this only reads in one data block at a time. + */ + length = len; + red = 0; + while (length) { + void *t; + /* + * Find requested blkid and the offset within that block. 
+ */ + uint64_t blkid = (file->offset + red) / blksz; + free(data->file_buf); + data->file_buf = 0; + + err = dmu_read(&(data->dnode), blkid, &t, + 0, data); + data->file_buf = t; + if (err) + return -1; + + data->file_start = blkid * blksz; + data->file_end = data->file_start + blksz; + + movesize = MIN(length, data->file_end - (int) file->offset - red); + + memmove(buf, data->file_buf + file->offset + red + - data->file_start, movesize); + buf += movesize; + length -= movesize; + red += movesize; + } + + return len; +} + +int +zfs_close(zfs_file_t file) +{ + zfs_unmount((struct grub_zfs_data *) file->data); + return ZFS_ERR_NONE; +} + +int +grub_zfs_getmdnobj(device_t dev, const char *fsfilename, + uint64_t *mdnobj) +{ + struct grub_zfs_data *data; + int err; + int isfs; + + data = zfs_mount(dev); + if (!data) + return ZFS_ERR_BAD_FS; + + err = dnode_get_fullpath(fsfilename, &(data->mdn), mdnobj, + &(data->dnode), &isfs, data); + zfs_unmount(data); + return err; +} + +static void +fill_fs_info(struct zfs_dirhook_info *info, + dnode_end_t mdn, struct grub_zfs_data *data) +{ + int err; + dnode_end_t dn; + uint64_t objnum; + uint64_t headobj; + + memset(info, 0, sizeof(*info)); + + info->dir = 1; + + if (mdn.dn.dn_type == DMU_OT_DSL_DIR) { + headobj = grub_zfs_to_cpu64(((dsl_dir_phys_t *) DN_BONUS(&mdn.dn))->dd_head_dataset_obj, mdn.endian); + + err = dnode_get(&(data->mos), headobj, DMU_OT_DSL_DATASET, &mdn, data); + if (err) { + printf("zfs failed here 1\n"); + return; + } + } + make_mdn(&mdn, data); + err = dnode_get(&mdn, MASTER_NODE_OBJ, DMU_OT_MASTER_NODE, + &dn, data); + if (err) { + printf("zfs failed here 2\n"); + return; + } + + err = zap_lookup(&dn, ZFS_ROOT_OBJ, &objnum, data); + if (err) { + printf("zfs failed here 3\n"); + return; + } + + err = dnode_get(&mdn, objnum, 0, &dn, data); + if (err) { + printf("zfs failed here 4\n"); + return; + } + + info->mtimeset = 1; + info->mtime = grub_zfs_to_cpu64(((znode_phys_t *) DN_BONUS(&dn.dn))->zp_mtime[0], 
dn.endian); + + return; +} + +static int iterate_zap(const char *name, uint64_t val, struct grub_zfs_data *data) +{ + struct zfs_dirhook_info info; + dnode_end_t dn; + + memset(&info, 0, sizeof(info)); + + dnode_get(&(data->mdn), val, 0, &dn, data); + info.mtimeset = 1; + info.mtime = grub_zfs_to_cpu64(((znode_phys_t *) DN_BONUS(&dn.dn))->zp_mtime[0], dn.endian); + info.dir = (dn.dn.dn_type == DMU_OT_DIRECTORY_CONTENTS); + debug("zfs type=%d, name=%s\n", + (int)dn.dn.dn_type, (char *)name); + if (!data->userhook) + return 0; + return data->userhook(name, &info); +} + +static int iterate_zap_fs(const char *name, uint64_t val, struct grub_zfs_data *data) +{ + struct zfs_dirhook_info info; + dnode_end_t mdn; + int err; + err = dnode_get(&(data->mos), val, 0, &mdn, data); + if (err) + return 0; + if (mdn.dn.dn_type != DMU_OT_DSL_DIR) + return 0; + + fill_fs_info(&info, mdn, data); + + if (!data->userhook) + return 0; + return data->userhook(name, &info); +} + +static int iterate_zap_snap(const char *name, uint64_t val, struct grub_zfs_data *data) +{ + struct zfs_dirhook_info info; + char *name2; + int ret = 0; + dnode_end_t mdn; + int err; + + err = dnode_get(&(data->mos), val, 0, &mdn, data); + if (err) + return 0; + + if (mdn.dn.dn_type != DMU_OT_DSL_DATASET) + return 0; + + fill_fs_info(&info, mdn, data); + + name2 = malloc(strlen(name) + 2); + name2[0] = '@'; + memcpy(name2 + 1, name, strlen(name) + 1); + if (data->userhook) + ret = data->userhook(name2, &info); + free(name2); + return ret; +} + +int +zfs_ls(device_t device, const char *path, + int (*hook)(const char *, const struct zfs_dirhook_info *)) +{ + struct grub_zfs_data *data; + int err; + int isfs; +#if 0 + char *label = NULL; + + zfs_label(device, &label); + if (label) + printf("ZPOOL label '%s'\n", + label); +#endif + + data = zfs_mount(device); + if (!data) + return ZFS_ERR_BAD_FS; + + data->userhook = hook; + + err = dnode_get_fullpath(path, &(data->mdn), 0, &(data->dnode), &isfs, data); + if (err) { 
+ zfs_unmount(data); + return err; + } + if (isfs) { + uint64_t childobj, headobj; + uint64_t snapobj; + dnode_end_t dn; + struct zfs_dirhook_info info; + + fill_fs_info(&info, data->dnode, data); + hook("@", &info); + + childobj = grub_zfs_to_cpu64(((dsl_dir_phys_t *) DN_BONUS(&data->dnode.dn))->dd_child_dir_zapobj, data->dnode.endian); + headobj = grub_zfs_to_cpu64(((dsl_dir_phys_t *) DN_BONUS(&data->dnode.dn))->dd_head_dataset_obj, data->dnode.endian); + err = dnode_get(&(data->mos), childobj, + DMU_OT_DSL_DIR_CHILD_MAP, &dn, data); + if (err) { + zfs_unmount(data); + return err; + } + + + zap_iterate(&dn, iterate_zap_fs, data); + + err = dnode_get(&(data->mos), headobj, DMU_OT_DSL_DATASET, &dn, data); + if (err) { + zfs_unmount(data); + return err; + } + + snapobj = grub_zfs_to_cpu64(((dsl_dataset_phys_t *) DN_BONUS(&dn.dn))->ds_snapnames_zapobj, dn.endian); + + err = dnode_get(&(data->mos), snapobj, + DMU_OT_DSL_DS_SNAP_MAP, &dn, data); + if (err) { + zfs_unmount(data); + return err; + } + + zap_iterate(&dn, iterate_zap_snap, data); + } else { + if (data->dnode.dn.dn_type != DMU_OT_DIRECTORY_CONTENTS) { + zfs_unmount(data); + printf("not a directory\n"); + return ZFS_ERR_BAD_FILE_TYPE; + } + zap_iterate(&(data->dnode), iterate_zap, data); + } + zfs_unmount(data); + return ZFS_ERR_NONE; +} + diff --git a/fs/zfs/zfs_fletcher.c b/fs/zfs/zfs_fletcher.c new file mode 100644 index 0000000..d96c6ff --- /dev/null +++ b/fs/zfs/zfs_fletcher.c @@ -0,0 +1,84 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004,2009 + * Free Software Foundation, Inc. + * Copyright 2007 Sun Microsystems, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. 
+ * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ + +#include <common.h> +#include <malloc.h> +#include <linux/stat.h> +#include <linux/time.h> +#include <linux/ctype.h> +#include <asm/byteorder.h> +#include "zfs_common.h" + +#include <zfs/zfs.h> +#include <zfs/zio.h> +#include <zfs/dnode.h> +#include <zfs/uberblock_impl.h> +#include <zfs/vdev_impl.h> +#include <zfs/zio_checksum.h> +#include <zfs/zap_impl.h> +#include <zfs/zap_leaf.h> +#include <zfs/zfs_znode.h> +#include <zfs/dmu.h> +#include <zfs/dmu_objset.h> +#include <zfs/dsl_dir.h> +#include <zfs/dsl_dataset.h> + +void +fletcher_2(const void *buf, uint64_t size, grub_zfs_endian_t endian, + zio_cksum_t *zcp) +{ + const uint64_t *ip = buf; + const uint64_t *ipend = ip + (size / sizeof(uint64_t)); + uint64_t a0, b0, a1, b1; + + for (a0 = b0 = a1 = b1 = 0; ip < ipend; ip += 2) { + a0 += grub_zfs_to_cpu64(ip[0], endian); + a1 += grub_zfs_to_cpu64(ip[1], endian); + b0 += a0; + b1 += a1; + } + + zcp->zc_word[0] = grub_cpu_to_zfs64(a0, endian); + zcp->zc_word[1] = grub_cpu_to_zfs64(a1, endian); + zcp->zc_word[2] = grub_cpu_to_zfs64(b0, endian); + zcp->zc_word[3] = grub_cpu_to_zfs64(b1, endian); +} + +void +fletcher_4(const void *buf, uint64_t size, grub_zfs_endian_t endian, + zio_cksum_t *zcp) +{ + const uint32_t *ip = buf; + const uint32_t *ipend = ip + (size / sizeof(uint32_t)); + uint64_t a, b, c, d; + + for (a = b = c = d = 0; ip < ipend; ip++) { + a += grub_zfs_to_cpu32(ip[0], endian); + b += a; + c += b; + d += c; + } + + zcp->zc_word[0] = grub_cpu_to_zfs64(a, endian); + zcp->zc_word[1] = grub_cpu_to_zfs64(b, endian); + zcp->zc_word[2] = grub_cpu_to_zfs64(c, endian); + 
zcp->zc_word[3] = grub_cpu_to_zfs64(d, endian); +} + diff --git a/fs/zfs/zfs_lzjb.c b/fs/zfs/zfs_lzjb.c new file mode 100644 index 0000000..33e9b90 --- /dev/null +++ b/fs/zfs/zfs_lzjb.c @@ -0,0 +1,94 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004,2009 + * Free Software Foundation, Inc. + * Copyright 2007 Sun Microsystems, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. 
+ */ + +#include <common.h> +#include <malloc.h> +#include <linux/stat.h> +#include <linux/time.h> +#include <linux/ctype.h> +#include <asm/byteorder.h> +#include "zfs_common.h" + +#include <zfs/zfs.h> +#include <zfs/zio.h> +#include <zfs/dnode.h> +#include <zfs/uberblock_impl.h> +#include <zfs/vdev_impl.h> +#include <zfs/zio_checksum.h> +#include <zfs/zap_impl.h> +#include <zfs/zap_leaf.h> +#include <zfs/zfs_znode.h> +#include <zfs/dmu.h> +#include <zfs/dmu_objset.h> +#include <zfs/dsl_dir.h> +#include <zfs/dsl_dataset.h> + +#define MATCH_BITS 6 +#define MATCH_MIN 3 +#define OFFSET_MASK ((1 << (16 - MATCH_BITS)) - 1) + +/* + * Decompression Entry - lzjb + */ +#ifndef NBBY +#define NBBY 8 +#endif + +int +lzjb_decompress(void *s_start, void *d_start, uint32_t s_len, + uint32_t d_len) +{ + uint8_t *src = s_start; + uint8_t *dst = d_start; + uint8_t *d_end = (uint8_t *) d_start + d_len; + uint8_t *s_end = (uint8_t *) s_start + s_len; + uint8_t *cpy, copymap = 0; + int copymask = 1 << (NBBY - 1); + + while (dst < d_end && src < s_end) { + if ((copymask <<= 1) == (1 << NBBY)) { + copymask = 1; + copymap = *src++; + } + if (src >= s_end) { + printf("lzjb decompression failed\n"); + return ZFS_ERR_BAD_FS; + } + if (copymap & copymask) { + int mlen = (src[0] >> (NBBY - MATCH_BITS)) + MATCH_MIN; + int offset = ((src[0] << NBBY) | src[1]) & OFFSET_MASK; + src += 2; + cpy = dst - offset; + if (src > s_end || cpy < (uint8_t *) d_start) { + printf("lzjb decompression failed\n"); + return ZFS_ERR_BAD_FS; + } + while (--mlen >= 0 && dst < d_end) + *dst++ = *cpy++; + } else { + *dst++ = *src++; + } + } + if (dst < d_end) { + printf("lzjb decompression failed\n"); + return ZFS_ERR_BAD_FS; + } + return ZFS_ERR_NONE; +} diff --git a/fs/zfs/zfs_sha256.c b/fs/zfs/zfs_sha256.c new file mode 100644 index 0000000..7a9439a --- /dev/null +++ b/fs/zfs/zfs_sha256.c @@ -0,0 +1,145 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004,2009 + * Free Software 
Foundation, Inc. + * Copyright 2007 Sun Microsystems, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ + +#include <common.h> +#include <malloc.h> +#include <linux/stat.h> +#include <linux/time.h> +#include <linux/ctype.h> +#include <asm/byteorder.h> +#include "zfs_common.h" + +#include <zfs/zfs.h> +#include <zfs/zio.h> +#include <zfs/dnode.h> +#include <zfs/uberblock_impl.h> +#include <zfs/vdev_impl.h> +#include <zfs/zio_checksum.h> +#include <zfs/zap_impl.h> +#include <zfs/zap_leaf.h> +#include <zfs/zfs_znode.h> +#include <zfs/dmu.h> +#include <zfs/dmu_objset.h> +#include <zfs/dsl_dir.h> +#include <zfs/dsl_dataset.h> + +/* + * SHA-256 checksum, as specified in FIPS 180-2, available at: + * http://csrc.nist.gov/cryptval + * + * This is a very compact implementation of SHA-256. + * It is designed to be simple and portable, not to be fast. + */ + +/* + * The literal definitions according to FIPS180-2 would be: + * + * Ch(x, y, z) (((x) & (y)) ^ ((~(x)) & (z))) + * Maj(x, y, z) (((x) & (y)) | ((x) & (z)) | ((y) & (z))) + * + * We use logical equivalents which require one less op. 
+ */ +#define Ch(x, y, z) ((z) ^ ((x) & ((y) ^ (z)))) +#define Maj(x, y, z) (((x) & (y)) ^ ((z) & ((x) ^ (y)))) +#define Rot32(x, s) (((x) >> s) | ((x) << (32 - s))) +#define SIGMA0(x) (Rot32(x, 2) ^ Rot32(x, 13) ^ Rot32(x, 22)) +#define SIGMA1(x) (Rot32(x, 6) ^ Rot32(x, 11) ^ Rot32(x, 25)) +#define sigma0(x) (Rot32(x, 7) ^ Rot32(x, 18) ^ ((x) >> 3)) +#define sigma1(x) (Rot32(x, 17) ^ Rot32(x, 19) ^ ((x) >> 10)) + +static const uint32_t SHA256_K[64] = { + 0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5, + 0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5, + 0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3, + 0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174, + 0xe49b69c1, 0xefbe4786, 0x0fc19dc6, 0x240ca1cc, + 0x2de92c6f, 0x4a7484aa, 0x5cb0a9dc, 0x76f988da, + 0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7, + 0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967, + 0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, 0x53380d13, + 0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85, + 0xa2bfe8a1, 0xa81a664b, 0xc24b8b70, 0xc76c51a3, + 0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070, + 0x19a4c116, 0x1e376c08, 0x2748774c, 0x34b0bcb5, + 0x391c0cb3, 0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3, + 0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208, + 0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2 +}; + +static void +SHA256Transform(uint32_t *H, const uint8_t *cp) +{ + uint32_t a, b, c, d, e, f, g, h, t, T1, T2, W[64]; + + for (t = 0; t < 16; t++, cp += 4) + W[t] = (cp[0] << 24) | (cp[1] << 16) | (cp[2] << 8) | cp[3]; + + for (t = 16; t < 64; t++) + W[t] = sigma1(W[t - 2]) + W[t - 7] + + sigma0(W[t - 15]) + W[t - 16]; + + a = H[0]; b = H[1]; c = H[2]; d = H[3]; + e = H[4]; f = H[5]; g = H[6]; h = H[7]; + + for (t = 0; t < 64; t++) { + T1 = h + SIGMA1(e) + Ch(e, f, g) + SHA256_K[t] + W[t]; + T2 = SIGMA0(a) + Maj(a, b, c); + h = g; g = f; f = e; e = d + T1; + d = c; c = b; b = a; a = T1 + T2; + } + + H[0] += a; H[1] += b; H[2] += c; H[3] += d; + H[4] += e; H[5] += f; H[6] += g; H[7] += h; +} + +void +zio_checksum_SHA256(const 
void *buf, uint64_t size, + grub_zfs_endian_t endian, zio_cksum_t *zcp) +{ + uint32_t H[8] = { 0x6a09e667, 0xbb67ae85, 0x3c6ef372, 0xa54ff53a, + 0x510e527f, 0x9b05688c, 0x1f83d9ab, 0x5be0cd19 }; + uint8_t pad[128]; + unsigned padsize = size & 63; + unsigned i; + + for (i = 0; i < size - padsize; i += 64) + SHA256Transform(H, (uint8_t *)buf + i); + + for (i = 0; i < padsize; i++) + pad[i] = ((uint8_t *)buf)[i]; + + for (pad[padsize++] = 0x80; (padsize & 63) != 56; padsize++) + pad[padsize] = 0; + + for (i = 0; i < 8; i++) + pad[padsize++] = (size << 3) >> (56 - 8 * i); + + for (i = 0; i < padsize; i += 64) + SHA256Transform(H, pad + i); + + zcp->zc_word[0] = grub_cpu_to_zfs64((uint64_t)H[0] << 32 | H[1], + endian); + zcp->zc_word[1] = grub_cpu_to_zfs64((uint64_t)H[2] << 32 | H[3], + endian); + zcp->zc_word[2] = grub_cpu_to_zfs64((uint64_t)H[4] << 32 | H[5], + endian); + zcp->zc_word[3] = grub_cpu_to_zfs64((uint64_t)H[6] << 32 | H[7], + endian); +} diff --git a/include/config_cmd_all.h b/include/config_cmd_all.h index 55f4f7a..5933ae9 100644 --- a/include/config_cmd_all.h +++ b/include/config_cmd_all.h @@ -36,6 +36,7 @@ #define CONFIG_CMD_ELF /* ELF (VxWorks) load/boot cmd */ #define CONFIG_CMD_EXT2 /* EXT2 Support */ #define CONFIG_CMD_FAT /* FAT support */ +#define CONFIG_CMD_ZFS /* ZFS support */ #define CONFIG_CMD_FDC /* Floppy Disk Support */ #define CONFIG_CMD_FDOS /* Floppy DOS support */ #define CONFIG_CMD_FLASH /* flinfo, erase, protect */ diff --git a/include/zfs_common.h b/include/zfs_common.h new file mode 100644 index 0000000..969dbf5 --- /dev/null +++ b/include/zfs_common.h @@ -0,0 +1,94 @@ +/* + * ZFS filesystem implementation in Uboot by + * Jorgen Lundman <lundman at lundman.net> + * + * zfsfs support + * made from existing GRUB Sources by Sun, GNU and others. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. 
+ * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. + */ + +#ifndef __ZFS_COMMON__ +#define __ZFS_COMMON__ + +#define SECTOR_SIZE 0x200 +#define SECTOR_BITS 9 + +#define grub_le_to_cpu16 le16_to_cpu +#define grub_be_to_cpu16 be16_to_cpu +#define grub_le_to_cpu32 le32_to_cpu +#define grub_be_to_cpu32 be32_to_cpu +#define grub_le_to_cpu64 le64_to_cpu +#define grub_be_to_cpu64 be64_to_cpu + +#define grub_cpu_to_le64 cpu_to_le64 +#define grub_cpu_to_be64 cpu_to_be64 + +enum zfs_errors { + ZFS_ERR_NONE = 0, + ZFS_ERR_NOT_IMPLEMENTED_YET = -1, + ZFS_ERR_BAD_FS = -2, + ZFS_ERR_OUT_OF_MEMORY = -3, + ZFS_ERR_FILE_NOT_FOUND = -4, + ZFS_ERR_BAD_FILE_TYPE = -5, + ZFS_ERR_OUT_OF_RANGE = -6, +}; + +struct zfs_filesystem { + + /* Block Device Descriptor */ + block_dev_desc_t *dev_desc; +}; + + +extern block_dev_desc_t *zfs_dev_desc; + +struct device_s { + uint64_t part_length; +}; +typedef struct device_s *device_t; + +struct zfs_file { + device_t device; + uint64_t size; + void *data; + uint64_t offset; +}; + +typedef struct zfs_file *zfs_file_t; + +struct zfs_dirhook_info { + int dir; + int mtimeset; + time_t mtime; + time_t mtime2; +}; + + + + +struct zfs_filesystem *zfsget_fs(void); +int init_fs(block_dev_desc_t *dev_desc); +void deinit_fs(block_dev_desc_t *dev_desc); +int zfs_open(zfs_file_t, const char *filename); +uint64_t zfs_read(zfs_file_t, char *buf, uint64_t len); +struct grub_zfs_data *zfs_mount(device_t); +int zfs_close(zfs_file_t); +int zfs_ls(device_t dev, const char *path, + int (*hook) (const char *, const struct zfs_dirhook_info *)); +int zfs_devread(int sector, int 
byte_offset, int byte_len, char *buf); +int zfs_set_blk_dev(block_dev_desc_t *rbdd, int part); +void zfs_unmount(struct grub_zfs_data *data); +int lzjb_decompress(void *, void *, uint32_t, uint32_t); +#endif
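The comment in zfs_sha256.c above says that the Ch and Maj macros are logical equivalents of the FIPS 180-2 definitions that need one operation fewer. A minimal standalone check of that claim could look like the sketch below. The macro names CH_LITERAL, CH_SHORT, MAJ_LITERAL and MAJ_SHORT and the function sha256_macros_equivalent() are made up for this illustration and are not part of the patch. Since all four expressions are purely bitwise, checking the eight combinations of a single bit should cover every 32-bit input.

```c
#include <stdint.h>

/* Literal definitions as quoted from FIPS 180-2 in the patch comment. */
#define CH_LITERAL(x, y, z)	(((x) & (y)) ^ ((~(x)) & (z)))
#define MAJ_LITERAL(x, y, z)	(((x) & (y)) | ((x) & (z)) | ((y) & (z)))

/* Shorter forms, copied from the Ch and Maj macros in the patch. */
#define CH_SHORT(x, y, z)	((z) ^ ((x) & ((y) ^ (z))))
#define MAJ_SHORT(x, y, z)	(((x) & (y)) ^ ((z) & ((x) ^ (y))))

/*
 * Hypothetical helper: returns 1 if the short forms agree with the
 * literal forms for every combination of one input bit, 0 otherwise.
 */
int sha256_macros_equivalent(void)
{
	uint32_t x, y, z;

	for (x = 0; x <= 1; x++) {
		for (y = 0; y <= 1; y++) {
			for (z = 0; z <= 1; z++) {
				/* Mask to bit 0, since ~x also sets the upper bits. */
				if ((CH_LITERAL(x, y, z) & 1) !=
				    (CH_SHORT(x, y, z) & 1))
					return 0;
				if ((MAJ_LITERAL(x, y, z) & 1) !=
				    (MAJ_SHORT(x, y, z) & 1))
					return 0;
			}
		}
	}
	return 1;
}
```

Counting operators, the literal Ch uses four (two ANDs, a NOT, an XOR) against three in the short form, and the literal Maj uses five against four.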

Hi Jorgen,
On Wed, May 23, 2012 at 12:26 PM, Jorgen Lundman lundman@lundman.net wrote:
commit bc192bb0716b02b2b711dc2df62ed15e1160ea50
Author: Jorgen Lundman lundman@lundman.net
Date: Wed May 23 01:55:02 2012 +0000
[snip]
commit bea9588d98f52d95a325f3b71a7ae448242c7b64
Author: Jorgen Lundman lundman@lundman.net
Date: Thu May 10 05:11:03 2012 +0000
What are all these commit references? Are they from an external git-repo? If so I think the commit message might make more sense if you simply summarise what you have done to adapt the original source code in order to integrate it into U-Boot
Adding ZFS
 Makefile                 |    2 +-
 common/Makefile          |    1 +
 common/cmd_zfs.c         |  244 +++++
 fs/Makefile              |    1 +
 fs/{ => zfs}/Makefile    |   43 +-
 fs/zfs/dev.c             |  139 +++
 fs/zfs/zfs.c             | 2414 ++++++++++++++++++++++++++++++++++++++++++++++
 fs/zfs/zfs_fletcher.c    |   84 ++
 fs/zfs/zfs_lzjb.c        |   94 ++
 fs/zfs/zfs_sha256.c      |  145 +++
 include/config_cmd_all.h |    1 +
 include/zfs_common.h     |   94 ++
 12 files changed, 3246 insertions(+), 16 deletions(-)
 create mode 100644 common/cmd_zfs.c
 copy fs/{ => zfs}/Makefile (52%)
 create mode 100644 fs/zfs/dev.c
 create mode 100644 fs/zfs/zfs.c
 create mode 100644 fs/zfs/zfs_fletcher.c
 create mode 100644 fs/zfs/zfs_lzjb.c
 create mode 100644 fs/zfs/zfs_sha256.c
 create mode 100644 include/zfs_common.h
diff --git a/Makefile b/Makefile
index 351a8f0..d3b84bf 100644
--- a/Makefile
+++ b/Makefile
@@ -244,7 +244,7 @@ endif
 LIBS += arch/$(ARCH)/lib/lib$(ARCH).o
 LIBS += fs/cramfs/libcramfs.o fs/fat/libfat.o fs/fdos/libfdos.o fs/jffs2/libjffs2.o \
 	fs/reiserfs/libreiserfs.o fs/ext2/libext2fs.o fs/yaffs2/libyaffs2.o \
-	fs/ubifs/libubifs.o
+	fs/ubifs/libubifs.o fs/zfs/libzfs.o
 LIBS += net/libnet.o
 LIBS += disk/libdisk.o
 LIBS += drivers/bios_emulator/libatibiosemu.o
diff --git a/common/Makefile b/common/Makefile
index 6e23baa..181a9ad 100644
--- a/common/Makefile
+++ b/common/Makefile
@@ -90,6 +90,7 @@ COBJS-$(CONFIG_CMD_ELF) += cmd_elf.o
 COBJS-$(CONFIG_SYS_HUSH_PARSER) += cmd_exit.o
 COBJS-$(CONFIG_CMD_EXT2) += cmd_ext2.o
 COBJS-$(CONFIG_CMD_FAT) += cmd_fat.o
+COBJS-$(CONFIG_CMD_ZFS) += cmd_zfs.o
 COBJS-$(CONFIG_CMD_FDC)$(CONFIG_CMD_FDOS) += cmd_fdc.o
 COBJS-$(CONFIG_OF_LIBFDT) += cmd_fdt.o fdt_support.o
 COBJS-$(CONFIG_CMD_FDOS) += cmd_fdos.o
Please keep list sorted
diff --git a/common/cmd_zfs.c b/common/cmd_zfs.c
new file mode 100644
index 0000000..99c4318
--- /dev/null
+++ b/common/cmd_zfs.c
@@ -0,0 +1,244 @@
+/*
+ * ZFS filesystem implementation in Uboot by
+ * Jorgen Lundman <lundman at lundman.net>
+ * zfsfs support
+ * made from existing GRUB Sources by Sun, GNU and others.
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of
+ * the License, or (at your option) any later version.
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston,
+ * MA 02111-1307 USA
+ */
+/*
+ * Changelog:
+ * 0.1 - The Epoch
+ * - lundman
+ */
Don't include changelogs in source files - We have git for that :)
+#include <common.h>
+#include <part.h>
+#include <config.h>
+#include <command.h>
+#include <image.h>
+#include <linux/ctype.h>
+#include <asm/byteorder.h>
+#include <zfs_common.h>
+#include <linux/stat.h>
+#include <malloc.h>
+#if defined(CONFIG_CMD_USB) && defined(CONFIG_USB_STORAGE)
+#include <usb.h>
+#endif
+#if !defined(CONFIG_DOS_PARTITION) && !defined(CONFIG_EFI_PARTITION)
+#error DOS or EFI partition support must be selected
+#endif
+#define DOS_PART_MAGIC_OFFSET	0x1fe
+#define DOS_FS_TYPE_OFFSET	0x36
+#define DOS_FS32_TYPE_OFFSET	0x52
+static int do_zfs_load(cmd_tbl_t *cmdtp, int flag, int argc,
+			char *argv[])
Can this be aligned better? Personally I prefer to put each parameter on its own line once I hit the 80 character limit
+{
[snip]
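As an illustration of the "each parameter on its own line" layout suggested for do_zfs_load() above, here is a minimal sketch. The cmd_tbl_t typedef is only a stand-in so the fragment compiles outside U-Boot, and the function name do_zfs_load_sketch() and its placeholder body are assumptions for this example, not code from the patch.

```c
#include <stddef.h>

/* Stand-in for U-Boot's command table type, declared only so this compiles. */
typedef struct cmd_tbl_s cmd_tbl_t;

/*
 * One parameter per line, each aligned under the first parameter,
 * once the declaration no longer fits in 80 columns.
 */
int do_zfs_load_sketch(cmd_tbl_t *cmdtp,
		       int flag,
		       int argc,
		       char *argv[])
{
	(void)cmdtp;
	(void)flag;
	(void)argv;

	/* Placeholder body: treat a bare command with no arguments as a usage error. */
	return argc < 2 ? 1 : 0;
}
```

Whether to split every parameter or only wrap at the column limit is a matter of reviewer preference; either way the continuation lines line up with the opening parenthesis.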
diff --git a/fs/Makefile b/fs/Makefile
index 22aad12..b0d62c6 100644
--- a/fs/Makefile
+++ b/fs/Makefile
@@ -25,6 +25,7 @@
 subdirs-$(CONFIG_CMD_CRAMFS) := cramfs
 subdirs-$(CONFIG_CMD_EXT2) += ext2
 subdirs-$(CONFIG_CMD_FAT) += fat
+subdirs-$(CONFIG_CMD_ZFS) += zfs
 subdirs-$(CONFIG_CMD_FDOS) += fdos
 subdirs-$(CONFIG_CMD_JFFS2) += jffs2
 subdirs-$(CONFIG_CMD_REISER) += reiserfs
Keep sorted
diff --git a/fs/Makefile b/fs/zfs/Makefile
similarity index 52%
copy from fs/Makefile
copy to fs/zfs/Makefile
index 22aad12..00ab9e6 100644
--- a/fs/Makefile
+++ b/fs/zfs/Makefile
@@ -1,6 +1,10 @@
 #
-# (C) Copyright 2000-2006
-# Wolfgang Denk, DENX Software Engineering, wd@denx.de.
+# (C) Copyright 2006
+# Wolfgang Denk, DENX Software Engineering, <wd at denx.de>
Please don't mess with other devs' copyright lines
+#
+# (C) Copyright 2003
+# Pavel Bartusek, Sysgo Real-Time Solutions AG, <pba at sysgo.de>
+#
 #
 # See file CREDITS for list of people who contributed to this
 # project.
@@ -20,19 +24,28 @@
 # Foundation, Inc., 59 Temple Place, Suite 330, Boston,
 # MA 02111-1307 USA
 #
-#
-subdirs-$(CONFIG_CMD_CRAMFS) := cramfs
-subdirs-$(CONFIG_CMD_EXT2) += ext2
-subdirs-$(CONFIG_CMD_FAT) += fat
-subdirs-$(CONFIG_CMD_FDOS) += fdos
-subdirs-$(CONFIG_CMD_JFFS2) += jffs2
-subdirs-$(CONFIG_CMD_REISER) += reiserfs
-subdirs-$(CONFIG_YAFFS2) += yaffs2
-subdirs-$(CONFIG_CMD_UBIFS) += ubifs
+include $(TOPDIR)/config.mk
+LIB	= $(obj)libzfs.o
+AOBJS	=
+COBJS-$(CONFIG_CMD_ZFS) := dev.o zfs.o zfs_fletcher.o zfs_sha256.o zfs_lzjb.o
+SRCS	:= $(AOBJS:.o=.S) $(COBJS-y:.o=.c)
+OBJS	:= $(addprefix $(obj),$(AOBJS) $(COBJS-y))
+all:	$(LIB) $(AOBJS)
+$(LIB):	$(obj).depend $(OBJS)
+	$(call cmd_link_o_target, $(OBJS))
+#########################################################################
+# defines $(obj).depend target
+include $(SRCTREE)/rules.mk
-SUBDIRS	:= $(subdirs-y)
+sinclude $(obj).depend
-$(obj).depend all:
-	@for dir in $(SUBDIRS) ; do \
-	$(MAKE) -C $$dir $@ ; done
+#########################################################################
This looks to be an unrelated change - If needed, move this into a separate patch
diff --git a/fs/zfs/dev.c b/fs/zfs/dev.c
new file mode 100644
index 0000000..d61ff80
--- /dev/null
+++ b/fs/zfs/dev.c
@@ -0,0 +1,139 @@
+int zfs_set_blk_dev(block_dev_desc_t *rbdd, int part)
+{
+	zfs_block_dev_desc = rbdd;
+	if (part == 0) {
+		/* disk doesn't use partition table */
+		part_info.start = 0;
+		part_info.size = rbdd->lba;
+		part_info.blksz = rbdd->blksz;
+	} else {
+		if (get_partition_info
+		    (zfs_block_dev_desc, part, &part_info)) {
+			return 0;
+		}
+	}
Insert a blank line
+	return part_info.size;
+}
+/* err */
+int zfs_devread(int sector, int byte_offset, int byte_len, char *buf)
+{
+	short sec_buffer[SECTOR_SIZE/sizeof(short)];
+	char *sec_buf = sec_buffer;
+	unsigned block_len;
+	/*
+	 * Check partition boundaries
+	 */
+	if ((sector < 0)
+	    || ((sector + ((byte_offset + byte_len - 1) >> SECTOR_BITS)) >=
+		part_info.size)) {
Put the || operator on the previous line and fix vertical alignment of subsequent lines
+		/* errnum = ERR_OUTSIDE_PART; */
+		printf(" ** zfs_devread() read outside partition sector %d\n", sector);
+		return 1;
+	}
+	/*
+	 * Get the read to the beginning of a partition.
+	 */
+	sector += byte_offset >> SECTOR_BITS;
+	byte_offset &= SECTOR_SIZE - 1;
+	debug(" <%d, %d, %d>\n", sector, byte_offset, byte_len);
+	if (zfs_block_dev_desc == NULL) {
+		printf("** Invalid Block Device Descriptor (NULL)\n");
+		return 1;
+	}
+	if (byte_offset != 0) {
+		/* read first part which isn't aligned with start of sector */
+		if (zfs_block_dev_desc->
Ewww, that's a very ugly split
+		    block_read(zfs_block_dev_desc->dev,
+				part_info.start + sector, 1,
+				(unsigned long *) sec_buf) != 1) {
alignment - should look more like:
+        if (zfs_block_dev_desc->block_read(zfs_block_dev_desc->dev,
+                                           part_info.start + sector,
+                                           1,
+                                           (unsigned long *) sec_buf) != 1) {
[snip]
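The review comment on the partition bounds check in zfs_devread() asks for the || to sit at the end of the previous line with the continuation lines aligned. A minimal standalone sketch of that layout follows. The function name zfs_read_outside_partition() and the part_size parameter (standing in for part_info.size) are assumptions made so the fragment compiles on its own; they are not taken from the patch.

```c
#define SECTOR_BITS 9

/*
 * Hypothetical helper: returns 1 when a read of byte_len bytes starting
 * at byte_offset within sector would run past a partition that is
 * part_size sectors long, 0 otherwise.
 */
int zfs_read_outside_partition(int sector, int byte_offset, int byte_len,
			       unsigned long part_size)
{
	/* Operator at the end of the line, operands aligned underneath. */
	if ((sector < 0) ||
	    ((unsigned long)(sector +
			     ((byte_offset + byte_len - 1) >> SECTOR_BITS)) >=
	     part_size))
		return 1;

	return 0;
}
```

With 512-byte sectors, a 512-byte read at sector 0 of a one-sector partition is accepted, while a 513-byte read touches sector 1 and is rejected.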
diff --git a/fs/zfs/zfs.c b/fs/zfs/zfs.c
new file mode 100644
index 0000000..e7369cd
--- /dev/null
+++ b/fs/zfs/zfs.c
@@ -0,0 +1,2414 @@
+/*
+ * ZFS filesystem implementation in u-boot by
'ported to' versus 'implementation' ?
+ * Jorgen Lundman <lundman at lundman.net>
+ * ZFS-fs support
+ * made from existing GRUB Sources by Sun, GNU and others.
Is this an identical copyright attribution from the original source?
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+ */
+#include <common.h>
+#include <malloc.h>
+#include <linux/stat.h>
+#include <linux/time.h>
+#include <linux/ctype.h>
+#include <asm/byteorder.h>
+#include "zfs_common.h"
+block_dev_desc_t *zfs_dev_desc;
+/*
+ * GRUB -- GRand Unified Bootloader
+ * Copyright (C) 1999,2000,2001,2002,2003,2004,2009,2010
+ * Free Software Foundation, Inc.
+ * Copyright 2010 Sun Microsystems, Inc.
+ * GRUB is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 3 of the License, or
+ * (at your option) any later version.
+ * GRUB is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ * You should have received a copy of the GNU General Public License
+ * along with GRUB. If not, see http://www.gnu.org/licenses/.
+ */
Wow, I can't believe they put the copyright notice here! Maybe move it above the #includes
I quickly scanned the rest - nothing else not mentioned above jumped out at me. For a first cut, this looks pretty good
Regards,
Graeme

What are all these commit references? Are they from an external git-repo? If so I think the commit message might make more sense if you simply summarise what you have done to adapt the original source code in order to integrate it into U-Boot
Hmm, I assumed --squash would get rid of that history; I guess not. I'll find a way.
Please keep list sorted
I didn't even notice it was sorted. Of course this leaves the issue of:
COBJS-$(CONFIG_YAFFS2) += cmd_yaffs2.o
COBJS-$(CONFIG_CMD_ZFS) += cmd_zfs.o
COBJS-$(CONFIG_CMD_SPL) += cmd_spl.o
Not sure if I was supposed to go before or after that dangling spl.
Don't include changelogs in source files - We have git for that :)
Can this be aligned better? Personally I prefer to put each parameter on its own line once I hit the 80 character limit
Keep sorted
All done.
Please don't mess with other devs copyright lines
Oops.
This looks to be an unrelated change - If needed, move this into a separate patch
Actually, git noticed I copied the ../Makefile and changed it for zfs/Makefile. That's pretty neat.
Insert a blank line
Put the || operator on the previous line and fix vertical alignment of subsequent lines
Ewww, that's a very ugly split
Lesson here is don't assume file taken from ext2/dev.c to be ok :)
'ported to' versus 'implementation' ?
Ah of course.
Is this an identical copyright attribution from the original source?
Changed it to only have original license.
Once I test compile and compliance again, I will resend for your scrutiny.
Lund

Hi Jorgen,
On Wed, May 23, 2012 at 1:11 PM, Jorgen Lundman lundman@lundman.net wrote:
What are all these commit references? Are they from an external git-repo? If so I think the commit message might make more sense if you simply summarise what you have done to adapt the original source code in order to integrate it into U-Boot
Hmm, I assumed --squash would get rid of that history; I guess not. I'll find a way.
Please keep list sorted
I didn't even notice it was sorted. Of course this leaves the issue of:
COBJS-$(CONFIG_YAFFS2) += cmd_yaffs2.o
COBJS-$(CONFIG_CMD_ZFS) += cmd_zfs.o
COBJS-$(CONFIG_CMD_SPL) += cmd_spl.o
Not sure if I was supposed to go before or after that dangling spl.
Have a patch that sorts the list then add zfs :)
This looks to be an unrelated change - If needed, move this into a separate patch
Actually, git noticed I copied the ../Makefile and changed it for zfs/Makefile. That's pretty neat.
That's OK - it just looked odd
Once I test compile and compliance again, I will resend for your scrutiny.
Best to wait a couple of days - I'm sure there will be other comments
And don't forget to rev your patch series and add revision notes below the ---
Regards,
Graeme

Dear Jorgen Lundman,
In message 4FBC5578.5000404@lundman.net you wrote:
I didn't even notice it was sorted. Of course this leaves the issue of:
COBJS-$(CONFIG_YAFFS2) += cmd_yaffs2.o
COBJS-$(CONFIG_CMD_ZFS) += cmd_zfs.o
COBJS-$(CONFIG_CMD_SPL) += cmd_spl.o
Not sure if I was supposed to go before or after that dangling spl.
CONFIG_YAFFS2 should be renamed into CONFIG_CMD_YAFFS2, and both CONFIG_CMD_YAFFS2 and CONFIG_CMD_SPL should be sorted, too. But that would be a separate patch.
Best regards,
Wolfgang Denk

ZFS filesystem support from GRUB. Adding 'zfsload' and 'zfsls' commands for filesystem access. ZFS pool notation syntax is in the format '/POOLNAME/@/directory/directory/file', also explained in help output.
Initial revision given to GRUB is found: http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/grub/grub-0.97...
Using "either version 2 of the License, or (at your option) any later version."
Jorgen Lundman (2): ZFS header files zfs: Add ZFS support
Makefile | 2 +- common/Makefile | 1 + common/cmd_zfs.c | 236 +++++ fs/Makefile | 3 +- fs/{ => zfs}/Makefile | 39 +- fs/zfs/dev.c | 137 +++ fs/zfs/zfs.c | 2396 ++++++++++++++++++++++++++++++++++++++++++ fs/zfs/zfs_fletcher.c | 84 ++ fs/zfs/zfs_lzjb.c | 94 ++ fs/zfs/zfs_sha256.c | 145 +++ include/config_cmd_all.h | 1 + include/zfs/dmu.h | 119 +++ include/zfs/dmu_objset.h | 43 + include/zfs/dnode.h | 80 ++ include/zfs/dsl_dataset.h | 52 + include/zfs/dsl_dir.h | 48 + include/zfs/sa_impl.h | 34 + include/zfs/spa.h | 311 ++++++ include/zfs/uberblock_impl.h | 57 + include/zfs/vdev_impl.h | 69 ++ include/zfs/zap_impl.h | 112 ++ include/zfs/zap_leaf.h | 103 ++ include/zfs/zfs.h | 122 +++ include/zfs/zfs_acl.h | 55 + include/zfs/zfs_znode.h | 70 ++ include/zfs/zil.h | 56 + include/zfs/zio.h | 92 ++ include/zfs/zio_checksum.h | 49 + include/zfs_common.h | 94 ++ 29 files changed, 4687 insertions(+), 17 deletions(-) create mode 100644 common/cmd_zfs.c copy fs/{ => zfs}/Makefile (56%) create mode 100644 fs/zfs/dev.c create mode 100644 fs/zfs/zfs.c create mode 100644 fs/zfs/zfs_fletcher.c create mode 100644 fs/zfs/zfs_lzjb.c create mode 100644 fs/zfs/zfs_sha256.c create mode 100644 include/zfs/dmu.h create mode 100644 include/zfs/dmu_objset.h create mode 100644 include/zfs/dnode.h create mode 100644 include/zfs/dsl_dataset.h create mode 100644 include/zfs/dsl_dir.h create mode 100644 include/zfs/sa_impl.h create mode 100644 include/zfs/spa.h create mode 100644 include/zfs/uberblock_impl.h create mode 100644 include/zfs/vdev_impl.h create mode 100644 include/zfs/zap_impl.h create mode 100644 include/zfs/zap_leaf.h create mode 100644 include/zfs/zfs.h create mode 100644 include/zfs/zfs_acl.h create mode 100644 include/zfs/zfs_znode.h create mode 100644 include/zfs/zil.h create mode 100644 include/zfs/zio.h create mode 100644 include/zfs/zio_checksum.h create mode 100644 include/zfs_common.h
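To make the '/POOLNAME/@/directory/directory/file' notation from the cover letter more concrete, here is a minimal sketch of how such a string might be split. It assumes that the first '@' marks the boundary between the dataset part and the path inside that dataset; the helper name zfs_split_path() and its signature are hypothetical and do not come from the patch, which may well parse the string differently (for example to handle snapshot names after the '@').

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from the patch. Copies everything
 * before the first '@' into dataset (NUL-terminated) and returns a
 * pointer to the text that follows the '@'. Returns NULL if there is
 * no '@' or if the dataset part does not fit into dataset_len bytes.
 */
const char *zfs_split_path(const char *full, char *dataset, size_t dataset_len)
{
	const char *at = strchr(full, '@');
	size_t n;

	if (at == NULL)
		return NULL;

	n = (size_t)(at - full);
	if (n >= dataset_len)
		return NULL;

	memcpy(dataset, full, n);
	dataset[n] = '\0';

	return at + 1;
}
```

Under that assumption, the example string from the cover letter yields the dataset part '/POOLNAME/' and the in-dataset path '/directory/directory/file', while a string without '@' names a dataset only.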

Dear Jorgen Lundman,
In message 1337744719-27487-1-git-send-email-lundman@lundman.net you wrote:
ZFS filesystem support from GRUB. Adding 'zfsload' and 'zfsls' commands for filesystem access. ZFS pool notation syntax is in the format '/POOLNAME/@/directory/directory/file', also explained in help output.
Initial revision given to GRUB is found: http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/grub/grub-0.97...
Using "either version 2 of the License, or (at your option) any later version."
Please move this information into the actual patch commit messages. This cover letter is lost in the git history.
Best regards,
Wolfgang Denk

Signed-off-by: Jorgen Lundman lundman@lundman.net --- include/zfs/dmu.h | 119 ++++++++++++++++ include/zfs/dmu_objset.h | 43 ++++++ include/zfs/dnode.h | 80 +++++++++++ include/zfs/dsl_dataset.h | 52 +++++++ include/zfs/dsl_dir.h | 48 +++++++ include/zfs/sa_impl.h | 34 +++++ include/zfs/spa.h | 311 ++++++++++++++++++++++++++++++++++++++++++ include/zfs/uberblock_impl.h | 57 ++++++++ include/zfs/vdev_impl.h | 69 +++++++++ include/zfs/zap_impl.h | 112 +++++++++++++++ include/zfs/zap_leaf.h | 103 ++++++++++++++ include/zfs/zfs.h | 122 +++++++++++++++++ include/zfs/zfs_acl.h | 55 ++++++++ include/zfs/zfs_znode.h | 70 ++++++++++ include/zfs/zil.h | 56 ++++++++ include/zfs/zio.h | 92 +++++++++++++ include/zfs/zio_checksum.h | 49 +++++++ 17 files changed, 1472 insertions(+), 0 deletions(-) create mode 100644 include/zfs/dmu.h create mode 100644 include/zfs/dmu_objset.h create mode 100644 include/zfs/dnode.h create mode 100644 include/zfs/dsl_dataset.h create mode 100644 include/zfs/dsl_dir.h create mode 100644 include/zfs/sa_impl.h create mode 100644 include/zfs/spa.h create mode 100644 include/zfs/uberblock_impl.h create mode 100644 include/zfs/vdev_impl.h create mode 100644 include/zfs/zap_impl.h create mode 100644 include/zfs/zap_leaf.h create mode 100644 include/zfs/zfs.h create mode 100644 include/zfs/zfs_acl.h create mode 100644 include/zfs/zfs_znode.h create mode 100644 include/zfs/zil.h create mode 100644 include/zfs/zio.h create mode 100644 include/zfs/zio_checksum.h
diff --git a/include/zfs/dmu.h b/include/zfs/dmu.h new file mode 100644 index 0000000..bee317e --- /dev/null +++ b/include/zfs/dmu.h @@ -0,0 +1,119 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2010 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. + */ + +#ifndef _SYS_DMU_H +#define _SYS_DMU_H + +/* + * This file describes the interface that the DMU provides for its + * consumers. + * + * The DMU also interacts with the SPA. That interface is described in + * dmu_spa.h. 
+ */ +typedef enum dmu_object_type { + DMU_OT_NONE, + /* general: */ + DMU_OT_OBJECT_DIRECTORY, /* ZAP */ + DMU_OT_OBJECT_ARRAY, /* UINT64 */ + DMU_OT_PACKED_NVLIST, /* UINT8 (XDR by nvlist_pack/unpack) */ + DMU_OT_PACKED_NVLIST_SIZE, /* UINT64 */ + DMU_OT_BPLIST, /* UINT64 */ + DMU_OT_BPLIST_HDR, /* UINT64 */ + /* spa: */ + DMU_OT_SPACE_MAP_HEADER, /* UINT64 */ + DMU_OT_SPACE_MAP, /* UINT64 */ + /* zil: */ + DMU_OT_INTENT_LOG, /* UINT64 */ + /* dmu: */ + DMU_OT_DNODE, /* DNODE */ + DMU_OT_OBJSET, /* OBJSET */ + /* dsl: */ + DMU_OT_DSL_DIR, /* UINT64 */ + DMU_OT_DSL_DIR_CHILD_MAP, /* ZAP */ + DMU_OT_DSL_DS_SNAP_MAP, /* ZAP */ + DMU_OT_DSL_PROPS, /* ZAP */ + DMU_OT_DSL_DATASET, /* UINT64 */ + /* zpl: */ + DMU_OT_ZNODE, /* ZNODE */ + DMU_OT_OLDACL, /* OLD ACL */ + DMU_OT_PLAIN_FILE_CONTENTS, /* UINT8 */ + DMU_OT_DIRECTORY_CONTENTS, /* ZAP */ + DMU_OT_MASTER_NODE, /* ZAP */ + DMU_OT_UNLINKED_SET, /* ZAP */ + /* zvol: */ + DMU_OT_ZVOL, /* UINT8 */ + DMU_OT_ZVOL_PROP, /* ZAP */ + /* other; for testing only! 
*/ + DMU_OT_PLAIN_OTHER, /* UINT8 */ + DMU_OT_UINT64_OTHER, /* UINT64 */ + DMU_OT_ZAP_OTHER, /* ZAP */ + /* new object types: */ + DMU_OT_ERROR_LOG, /* ZAP */ + DMU_OT_SPA_HISTORY, /* UINT8 */ + DMU_OT_SPA_HISTORY_OFFSETS, /* spa_his_phys_t */ + DMU_OT_POOL_PROPS, /* ZAP */ + DMU_OT_DSL_PERMS, /* ZAP */ + DMU_OT_ACL, /* ACL */ + DMU_OT_SYSACL, /* SYSACL */ + DMU_OT_FUID, /* FUID table (Packed NVLIST UINT8) */ + DMU_OT_FUID_SIZE, /* FUID table size UINT64 */ + DMU_OT_NEXT_CLONES, /* ZAP */ + DMU_OT_SCRUB_QUEUE, /* ZAP */ + DMU_OT_USERGROUP_USED, /* ZAP */ + DMU_OT_USERGROUP_QUOTA, /* ZAP */ + DMU_OT_USERREFS, /* ZAP */ + DMU_OT_DDT_ZAP, /* ZAP */ + DMU_OT_DDT_STATS, /* ZAP */ + DMU_OT_SA, /* System attr */ + DMU_OT_SA_MASTER_NODE, /* ZAP */ + DMU_OT_SA_ATTR_REGISTRATION, /* ZAP */ + DMU_OT_SA_ATTR_LAYOUTS, /* ZAP */ + DMU_OT_NUMTYPES +} dmu_object_type_t; + +typedef enum dmu_objset_type { + DMU_OST_NONE, + DMU_OST_META, + DMU_OST_ZFS, + DMU_OST_ZVOL, + DMU_OST_OTHER, /* For testing only! */ + DMU_OST_ANY, /* Be careful! */ + DMU_OST_NUMTYPES +} dmu_objset_type_t; + +/* + * The names of zap entries in the DIRECTORY_OBJECT of the MOS. + */ +#define DMU_POOL_DIRECTORY_OBJECT 1 +#define DMU_POOL_CONFIG "config" +#define DMU_POOL_ROOT_DATASET "root_dataset" +#define DMU_POOL_SYNC_BPLIST "sync_bplist" +#define DMU_POOL_ERRLOG_SCRUB "errlog_scrub" +#define DMU_POOL_ERRLOG_LAST "errlog_last" +#define DMU_POOL_SPARES "spares" +#define DMU_POOL_DEFLATE "deflate" +#define DMU_POOL_HISTORY "history" +#define DMU_POOL_PROPS "pool_props" +#define DMU_POOL_L2CACHE "l2cache" + +#endif /* _SYS_DMU_H */ diff --git a/include/zfs/dmu_objset.h b/include/zfs/dmu_objset.h new file mode 100644 index 0000000..176cad7 --- /dev/null +++ b/include/zfs/dmu_objset.h @@ -0,0 +1,43 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. 
+ * Copyright (C) 2010 Robert Millan rmh@gnu.org + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2009 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. + */ + +#ifndef _SYS_DMU_OBJSET_H +#define _SYS_DMU_OBJSET_H + +#include <zfs/zil.h> + +#define OBJSET_PHYS_SIZE 2048 +#define OBJSET_PHYS_SIZE_V14 1024 + +typedef struct objset_phys { + dnode_phys_t os_meta_dnode; + zil_header_t os_zil_header; + uint64_t os_type; + uint64_t os_flags; + char os_pad[OBJSET_PHYS_SIZE - sizeof(dnode_phys_t)*3 - + sizeof(zil_header_t) - sizeof(uint64_t)*2]; + dnode_phys_t os_userused_dnode; + dnode_phys_t os_groupused_dnode; +} objset_phys_t; + +#endif /* _SYS_DMU_OBJSET_H */ diff --git a/include/zfs/dnode.h b/include/zfs/dnode.h new file mode 100644 index 0000000..9ec3d43 --- /dev/null +++ b/include/zfs/dnode.h @@ -0,0 +1,80 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. 
+ * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2010 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. + */ + +#ifndef _SYS_DNODE_H +#define _SYS_DNODE_H + +#include <zfs/spa.h> + +/* + * Fixed constants. + */ +#define DNODE_SHIFT 9 /* 512 bytes */ +#define DN_MIN_INDBLKSHIFT 10 /* 1k */ +#define DN_MAX_INDBLKSHIFT 14 /* 16k */ +#define DNODE_BLOCK_SHIFT 14 /* 16k */ +#define DNODE_CORE_SIZE 64 /* 64 bytes for dnode sans blkptrs */ +#define DN_MAX_OBJECT_SHIFT 48 /* 256 trillion (zfs_fid_t limit) */ +#define DN_MAX_OFFSET_SHIFT 64 /* 2^64 bytes in a dnode */ + +/* + * Derived constants. + */ +#define DNODE_SIZE (1 << DNODE_SHIFT) +#define DN_MAX_NBLKPTR ((DNODE_SIZE - DNODE_CORE_SIZE) >> SPA_BLKPTRSHIFT) +#define DN_MAX_BONUSLEN (DNODE_SIZE - DNODE_CORE_SIZE - (1 << SPA_BLKPTRSHIFT)) +#define DN_MAX_OBJECT (1ULL << DN_MAX_OBJECT_SHIFT) + +#define DNODES_PER_BLOCK_SHIFT (DNODE_BLOCK_SHIFT - DNODE_SHIFT) +#define DNODES_PER_BLOCK (1ULL << DNODES_PER_BLOCK_SHIFT) +#define DNODES_PER_LEVEL_SHIFT (DN_MAX_INDBLKSHIFT - SPA_BLKPTRSHIFT) + +#define DNODE_FLAG_SPILL_BLKPTR (1<<2) + +#define DN_BONUS(dnp) ((void *)((dnp)->dn_bonus + \ + (((dnp)->dn_nblkptr - 1) * sizeof(blkptr_t)))) + +typedef struct dnode_phys { + uint8_t dn_type; /* dmu_object_type_t */ + uint8_t dn_indblkshift; /* ln2(indirect block size) */ + uint8_t dn_nlevels; /* 1=dn_blkptr->data blocks */ + uint8_t dn_nblkptr; /* length of dn_blkptr */ + uint8_t dn_bonustype; /* type of data in bonus buffer */ + uint8_t dn_checksum; /* ZIO_CHECKSUM type */ + uint8_t dn_compress; /* ZIO_COMPRESS type */ + uint8_t dn_flags; /* 
DNODE_FLAG_* */ + uint16_t dn_datablkszsec; /* data block size in 512b sectors */ + uint16_t dn_bonuslen; /* length of dn_bonus */ + uint8_t dn_pad2[4]; + + /* accounting is protected by dn_dirty_mtx */ + uint64_t dn_maxblkid; /* largest allocated block ID */ + uint64_t dn_used; /* bytes (or sectors) of disk space */ + + uint64_t dn_pad3[4]; + + blkptr_t dn_blkptr[1]; + uint8_t dn_bonus[DN_MAX_BONUSLEN - sizeof(blkptr_t)]; + blkptr_t dn_spill; +} dnode_phys_t; + +#endif /* _SYS_DNODE_H */ diff --git a/include/zfs/dsl_dataset.h b/include/zfs/dsl_dataset.h new file mode 100644 index 0000000..c6de7ab --- /dev/null +++ b/include/zfs/dsl_dataset.h @@ -0,0 +1,52 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2007 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. 
+ */ + +#ifndef _SYS_DSL_DATASET_H +#define _SYS_DSL_DATASET_H + +typedef struct dsl_dataset_phys { + uint64_t ds_dir_obj; + uint64_t ds_prev_snap_obj; + uint64_t ds_prev_snap_txg; + uint64_t ds_next_snap_obj; + uint64_t ds_snapnames_zapobj; /* zap obj of snaps; ==0 for snaps */ + uint64_t ds_num_children; /* clone/snap children; ==0 for head */ + uint64_t ds_creation_time; /* seconds since 1970 */ + uint64_t ds_creation_txg; + uint64_t ds_deadlist_obj; + uint64_t ds_used_bytes; + uint64_t ds_compressed_bytes; + uint64_t ds_uncompressed_bytes; + uint64_t ds_unique_bytes; /* only relevant to snapshots */ + /* + * The ds_fsid_guid is a 56-bit ID that can change to avoid + * collisions. The ds_guid is a 64-bit ID that will never + * change, so there is a small probability that it will collide. + */ + uint64_t ds_fsid_guid; + uint64_t ds_guid; + uint64_t ds_flags; + blkptr_t ds_bp; + uint64_t ds_pad[8]; /* pad out to 320 bytes for good measure */ +} dsl_dataset_phys_t; + +#endif /* _SYS_DSL_DATASET_H */ diff --git a/include/zfs/dsl_dir.h b/include/zfs/dsl_dir.h new file mode 100644 index 0000000..c04e0b6 --- /dev/null +++ b/include/zfs/dsl_dir.h @@ -0,0 +1,48 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2007 Sun Microsystems, Inc. 
All rights reserved. + * Use is subject to license terms. + */ + +#ifndef _SYS_DSL_DIR_H +#define _SYS_DSL_DIR_H + +typedef struct dsl_dir_phys { + uint64_t dd_creation_time; /* not actually used */ + uint64_t dd_head_dataset_obj; + uint64_t dd_parent_obj; + uint64_t dd_clone_parent_obj; + uint64_t dd_child_dir_zapobj; + /* + * how much space our children are accounting for; for leaf + * datasets, == physical space used by fs + snaps + */ + uint64_t dd_used_bytes; + uint64_t dd_compressed_bytes; + uint64_t dd_uncompressed_bytes; + /* Administrative quota setting */ + uint64_t dd_quota; + /* Administrative reservation setting */ + uint64_t dd_reserved; + uint64_t dd_props_zapobj; + uint64_t dd_deleg_zapobj; /* dataset permissions */ + uint64_t dd_pad[20]; /* pad out to 256 bytes for good measure */ +} dsl_dir_phys_t; + +#endif /* _SYS_DSL_DIR_H */ diff --git a/include/zfs/sa_impl.h b/include/zfs/sa_impl.h new file mode 100644 index 0000000..4ec49fe --- /dev/null +++ b/include/zfs/sa_impl.h @@ -0,0 +1,34 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2010 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. 
+ */ +#ifndef _SYS_SA_IMPL_H +#define _SYS_SA_IMPL_H + +typedef struct sa_hdr_phys { + uint32_t sa_magic; + uint16_t sa_layout_info; + uint16_t sa_lengths[1]; +} sa_hdr_phys_t; + +#define SA_HDR_SIZE(hdr) BF32_GET_SB(hdr->sa_layout_info, 10, 16, 3, 0) +#define SA_SIZE_OFFSET 0x8 + +#endif /* _SYS_SA_IMPL_H */ diff --git a/include/zfs/spa.h b/include/zfs/spa.h new file mode 100644 index 0000000..100e2a6 --- /dev/null +++ b/include/zfs/spa.h @@ -0,0 +1,311 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004,2009 + * Free Software Foundation, Inc. + * Copyright 2010 Sun Microsystems, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ + +#ifndef GRUB_ZFS_SPA_HEADER +#define GRUB_ZFS_SPA_HEADER 1 + +typedef enum grub_zfs_endian { + UNKNOWN_ENDIAN = -2, + LITTLE_ENDIAN = -1, + BIG_ENDIAN = 0 +} grub_zfs_endian_t; + + +#define grub_zfs_to_cpu16(x, a) (((a) == BIG_ENDIAN) ? grub_be_to_cpu16(x) \ + : grub_le_to_cpu16(x)) +#define grub_cpu_to_zfs16(x, a) (((a) == BIG_ENDIAN) ? grub_cpu_to_be16(x) \ + : grub_cpu_to_le16(x)) + +#define grub_zfs_to_cpu32(x, a) (((a) == BIG_ENDIAN) ? grub_be_to_cpu32(x) \ + : grub_le_to_cpu32(x)) +#define grub_cpu_to_zfs32(x, a) (((a) == BIG_ENDIAN) ? grub_cpu_to_be32(x) \ + : grub_cpu_to_le32(x)) + +#define grub_zfs_to_cpu64(x, a) (((a) == BIG_ENDIAN) ? 
grub_be_to_cpu64(x) \ + : grub_le_to_cpu64(x)) +#define grub_cpu_to_zfs64(x, a) (((a) == BIG_ENDIAN) ? grub_cpu_to_be64(x) \ + : grub_cpu_to_le64(x)) + +/* + * General-purpose 32-bit and 64-bit bitfield encodings. + */ +#define BF32_DECODE(x, low, len) P2PHASE((x) >> (low), 1U << (len)) +#define BF64_DECODE(x, low, len) P2PHASE((x) >> (low), 1ULL << (len)) +#define BF32_ENCODE(x, low, len) (P2PHASE((x), 1U << (len)) << (low)) +#define BF64_ENCODE(x, low, len) (P2PHASE((x), 1ULL << (len)) << (low)) + +#define BF32_GET(x, low, len) BF32_DECODE(x, low, len) +#define BF64_GET(x, low, len) BF64_DECODE(x, low, len) + +#define BF32_SET(x, low, len, val) \ + ((x) ^= BF32_ENCODE((x >> low) ^ (val), low, len)) +#define BF64_SET(x, low, len, val) \ + ((x) ^= BF64_ENCODE((x >> low) ^ (val), low, len)) + +#define BF32_GET_SB(x, low, len, shift, bias) \ + ((BF32_GET(x, low, len) + (bias)) << (shift)) +#define BF64_GET_SB(x, low, len, shift, bias) \ + ((BF64_GET(x, low, len) + (bias)) << (shift)) + +#define BF32_SET_SB(x, low, len, shift, bias, val) \ + BF32_SET(x, low, len, ((val) >> (shift)) - (bias)) +#define BF64_SET_SB(x, low, len, shift, bias, val) \ + BF64_SET(x, low, len, ((val) >> (shift)) - (bias)) + +/* + * We currently support nine block sizes, from 512 bytes to 128K. + * We could go higher, but the benefits are near-zero and the cost + * of COWing a giant block to modify one byte would become excessive. + */ +#define SPA_MINBLOCKSHIFT 9 +#define SPA_MAXBLOCKSHIFT 17 +#define SPA_MINBLOCKSIZE (1ULL << SPA_MINBLOCKSHIFT) +#define SPA_MAXBLOCKSIZE (1ULL << SPA_MAXBLOCKSHIFT) + +#define SPA_BLOCKSIZES (SPA_MAXBLOCKSHIFT - SPA_MINBLOCKSHIFT + 1) + +/* + * Size of block to hold the configuration data (a packed nvlist) + */ +#define SPA_CONFIG_BLOCKSIZE (1 << 14) + +/* + * The DVA size encodings for LSIZE and PSIZE support blocks up to 32MB. 
+ * The ASIZE encoding should be at least 64 times larger (6 more bits)
+ * to support up to 4-way RAID-Z mirror mode with worst-case gang block
+ * overhead, three DVAs per bp, plus one more bit in case we do anything
+ * else that expands the ASIZE.
+ */
+#define SPA_LSIZEBITS	16	/* LSIZE up to 32M (2^16 * 512) */
+#define SPA_PSIZEBITS	16	/* PSIZE up to 32M (2^16 * 512) */
+#define SPA_ASIZEBITS	24	/* ASIZE up to 64 times larger */
+
+/*
+ * All SPA data is represented by 128-bit data virtual addresses (DVAs).
+ * The members of the dva_t should be considered opaque outside the SPA.
+ */
+typedef struct dva {
+	uint64_t dva_word[2];
+} dva_t;
+
+/*
+ * Each block has a 256-bit checksum -- strong enough for cryptographic hashes.
+ */
+typedef struct zio_cksum {
+	uint64_t zc_word[4];
+} zio_cksum_t;
+
+/*
+ * Each block is described by its DVAs, time of birth, checksum, etc.
+ * The word-by-word, bit-by-bit layout of the blkptr is as follows:
+ *
+ *      64      56      48      40      32      24      16      8       0
+ *      +-------+-------+-------+-------+-------+-------+-------+-------+
+ * 0    |             vdev1             | GRID  |         ASIZE         |
+ *      +-------+-------+-------+-------+-------+-------+-------+-------+
+ * 1    |G|                           offset1                           |
+ *      +-------+-------+-------+-------+-------+-------+-------+-------+
+ * 2    |             vdev2             | GRID  |         ASIZE         |
+ *      +-------+-------+-------+-------+-------+-------+-------+-------+
+ * 3    |G|                           offset2                           |
+ *      +-------+-------+-------+-------+-------+-------+-------+-------+
+ * 4    |             vdev3             | GRID  |         ASIZE         |
+ *      +-------+-------+-------+-------+-------+-------+-------+-------+
+ * 5    |G|                           offset3                           |
+ *      +-------+-------+-------+-------+-------+-------+-------+-------+
+ * 6    |BDX|lvl| type  | cksum | comp  |     PSIZE     |     LSIZE     |
+ *      +-------+-------+-------+-------+-------+-------+-------+-------+
+ * 7    |                            padding                            |
+ *      +-------+-------+-------+-------+-------+-------+-------+-------+
+ * 8    |                            padding                            |
+ *      +-------+-------+-------+-------+-------+-------+-------+-------+
+ * 9    |                      physical birth txg                       |
+ *      +-------+-------+-------+-------+-------+-------+-------+-------+
+ * a    |                       logical birth txg                       |
+ *      +-------+-------+-------+-------+-------+-------+-------+-------+
+ * b    |                          fill count                           |
+ *      +-------+-------+-------+-------+-------+-------+-------+-------+
+ * c    |                          checksum[0]                          |
+ *      +-------+-------+-------+-------+-------+-------+-------+-------+
+ * d    |                          checksum[1]                          |
+ *      +-------+-------+-------+-------+-------+-------+-------+-------+
+ * e    |                          checksum[2]                          |
+ *      +-------+-------+-------+-------+-------+-------+-------+-------+
+ * f    |                          checksum[3]                          |
+ *      +-------+-------+-------+-------+-------+-------+-------+-------+
+ *
+ * Legend:
+ *
+ * vdev          virtual device ID
+ * offset        offset into virtual device
+ * LSIZE         logical size
+ * PSIZE         physical size (after compression)
+ * ASIZE         allocated size (including RAID-Z parity and gang block headers)
+ * GRID          RAID-Z layout information (reserved for future use)
+ * cksum         checksum function
+ * comp          compression function
+ * G             gang block indicator
+ * B             byteorder (endianness)
+ * D             dedup
+ * X             unused
+ * lvl           level of indirection
+ * type          DMU object type
+ * phys birth    txg of block allocation; zero if same as logical birth txg
+ * log. birth    transaction group in which the block was logically born
+ * fill count    number of non-zero blocks under this bp
+ * checksum[4]   256-bit checksum of the data this bp describes
+ */
+#define SPA_BLKPTRSHIFT	7	/* blkptr_t is 128 bytes */
+#define SPA_DVAS_PER_BP	3	/* Number of DVAs in a bp */
+
+typedef struct blkptr {
+	dva_t blk_dva[SPA_DVAS_PER_BP];	/* Data Virtual Addresses */
+	uint64_t blk_prop;	/* size, compression, type, etc */
+	uint64_t blk_pad[2];	/* Extra space for the future */
+	uint64_t blk_phys_birth;	/* txg when block was allocated */
+	uint64_t blk_birth;	/* transaction group at birth */
+	uint64_t blk_fill;	/* fill count */
+	zio_cksum_t blk_cksum;	/* 256-bit checksum */
+} blkptr_t;
+
+/*
+ * Macros to get and set fields in a bp or DVA.
+ */ +#define DVA_GET_ASIZE(dva) \ + BF64_GET_SB((dva)->dva_word[0], 0, 24, SPA_MINBLOCKSHIFT, 0) +#define DVA_SET_ASIZE(dva, x) \ + BF64_SET_SB((dva)->dva_word[0], 0, 24, SPA_MINBLOCKSHIFT, 0, x) + +#define DVA_GET_GRID(dva) BF64_GET((dva)->dva_word[0], 24, 8) +#define DVA_SET_GRID(dva, x) BF64_SET((dva)->dva_word[0], 24, 8, x) + +#define DVA_GET_VDEV(dva) BF64_GET((dva)->dva_word[0], 32, 32) +#define DVA_SET_VDEV(dva, x) BF64_SET((dva)->dva_word[0], 32, 32, x) + +#define DVA_GET_GANG(dva) BF64_GET((dva)->dva_word[1], 63, 1) +#define DVA_SET_GANG(dva, x) BF64_SET((dva)->dva_word[1], 63, 1, x) + +#define BP_GET_LSIZE(bp) \ + BF64_GET_SB((bp)->blk_prop, 0, 16, SPA_MINBLOCKSHIFT, 1) +#define BP_SET_LSIZE(bp, x) \ + BF64_SET_SB((bp)->blk_prop, 0, 16, SPA_MINBLOCKSHIFT, 1, x) + +#define BP_GET_COMPRESS(bp) BF64_GET((bp)->blk_prop, 32, 8) +#define BP_SET_COMPRESS(bp, x) BF64_SET((bp)->blk_prop, 32, 8, x) + +#define BP_GET_CHECKSUM(bp) BF64_GET((bp)->blk_prop, 40, 8) +#define BP_SET_CHECKSUM(bp, x) BF64_SET((bp)->blk_prop, 40, 8, x) + +#define BP_GET_TYPE(bp) BF64_GET((bp)->blk_prop, 48, 8) +#define BP_SET_TYPE(bp, x) BF64_SET((bp)->blk_prop, 48, 8, x) + +#define BP_GET_LEVEL(bp) BF64_GET((bp)->blk_prop, 56, 5) +#define BP_SET_LEVEL(bp, x) BF64_SET((bp)->blk_prop, 56, 5, x) + +#define BP_GET_PROP_BIT_61(bp) BF64_GET((bp)->blk_prop, 61, 1) +#define BP_SET_PROP_BIT_61(bp, x) BF64_SET((bp)->blk_prop, 61, 1, x) + +#define BP_GET_DEDUP(bp) BF64_GET((bp)->blk_prop, 62, 1) +#define BP_SET_DEDUP(bp, x) BF64_SET((bp)->blk_prop, 62, 1, x) + +#define BP_GET_BYTEORDER(bp) (0 - BF64_GET((bp)->blk_prop, 63, 1)) +#define BP_SET_BYTEORDER(bp, x) BF64_SET((bp)->blk_prop, 63, 1, x) + +#define BP_PHYSICAL_BIRTH(bp) \ + ((bp)->blk_phys_birth ? (bp)->blk_phys_birth : (bp)->blk_birth) + +#define BP_SET_BIRTH(bp, logical, physical) \ + { \ + (bp)->blk_birth = (logical); \ + (bp)->blk_phys_birth = ((logical) == (physical) ? 
0 : (physical)); \ + } + +#define BP_GET_ASIZE(bp) \ + (DVA_GET_ASIZE(&(bp)->blk_dva[0]) + DVA_GET_ASIZE(&(bp)->blk_dva[1]) + \ + DVA_GET_ASIZE(&(bp)->blk_dva[2])) + +#define BP_GET_UCSIZE(bp) \ + ((BP_GET_LEVEL(bp) > 0 || dmu_ot[BP_GET_TYPE(bp)].ot_metadata) ? \ + BP_GET_PSIZE(bp) : BP_GET_LSIZE(bp)) + +#define BP_GET_NDVAS(bp) \ + (!!DVA_GET_ASIZE(&(bp)->blk_dva[0]) + \ + !!DVA_GET_ASIZE(&(bp)->blk_dva[1]) + \ + !!DVA_GET_ASIZE(&(bp)->blk_dva[2])) + +#define BP_COUNT_GANG(bp) \ + (DVA_GET_GANG(&(bp)->blk_dva[0]) + \ + DVA_GET_GANG(&(bp)->blk_dva[1]) + \ + DVA_GET_GANG(&(bp)->blk_dva[2])) + +#define DVA_EQUAL(dva1, dva2) \ + ((dva1)->dva_word[1] == (dva2)->dva_word[1] && \ + (dva1)->dva_word[0] == (dva2)->dva_word[0]) + +#define BP_EQUAL(bp1, bp2) \ + (BP_PHYSICAL_BIRTH(bp1) == BP_PHYSICAL_BIRTH(bp2) && \ + DVA_EQUAL(&(bp1)->blk_dva[0], &(bp2)->blk_dva[0]) && \ + DVA_EQUAL(&(bp1)->blk_dva[1], &(bp2)->blk_dva[1]) && \ + DVA_EQUAL(&(bp1)->blk_dva[2], &(bp2)->blk_dva[2])) + +#define ZIO_CHECKSUM_EQUAL(zc1, zc2) \ + (0 == (((zc1).zc_word[0] - (zc2).zc_word[0]) | \ + ((zc1).zc_word[1] - (zc2).zc_word[1]) | \ + ((zc1).zc_word[2] - (zc2).zc_word[2]) | \ + ((zc1).zc_word[3] - (zc2).zc_word[3]))) + +#define DVA_IS_VALID(dva) (DVA_GET_ASIZE(dva) != 0) + +#define ZIO_SET_CHECKSUM(zcp, w0, w1, w2, w3) \ + { \ + (zcp)->zc_word[0] = w0; \ + (zcp)->zc_word[1] = w1; \ + (zcp)->zc_word[2] = w2; \ + (zcp)->zc_word[3] = w3; \ + } + +#define BP_IDENTITY(bp) (&(bp)->blk_dva[0]) +#define BP_IS_GANG(bp) DVA_GET_GANG(BP_IDENTITY(bp)) +#define BP_IS_HOLE(bp) ((bp)->blk_birth == 0) + +/* BP_IS_RAIDZ(bp) assumes no block compression */ +#define BP_IS_RAIDZ(bp) (DVA_GET_ASIZE(&(bp)->blk_dva[0]) > \ + BP_GET_PSIZE(bp)) + +#define BP_ZERO(bp) \ + { \ + (bp)->blk_dva[0].dva_word[0] = 0; \ + (bp)->blk_dva[0].dva_word[1] = 0; \ + (bp)->blk_dva[1].dva_word[0] = 0; \ + (bp)->blk_dva[1].dva_word[1] = 0; \ + (bp)->blk_dva[2].dva_word[0] = 0; \ + (bp)->blk_dva[2].dva_word[1] = 0; \ + (bp)->blk_prop = 
0; \ + (bp)->blk_pad[0] = 0; \ + (bp)->blk_pad[1] = 0; \ + (bp)->blk_phys_birth = 0; \ + (bp)->blk_birth = 0; \ + (bp)->blk_fill = 0; \ + ZIO_SET_CHECKSUM(&(bp)->blk_cksum, 0, 0, 0, 0); \ + } + +#define BP_SPRINTF_LEN 320 + +#endif /* ! GRUB_ZFS_SPA_HEADER */ diff --git a/include/zfs/uberblock_impl.h b/include/zfs/uberblock_impl.h new file mode 100644 index 0000000..12daf98 --- /dev/null +++ b/include/zfs/uberblock_impl.h @@ -0,0 +1,57 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 + * Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2010 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. + */ + +#ifndef _SYS_UBERBLOCK_IMPL_H +#define _SYS_UBERBLOCK_IMPL_H + +#define UBMAX(a, b) ((a) > (b) ? (a) : (b)) + +/* + * The uberblock version is incremented whenever an incompatible on-disk + * format change is made to the SPA, DMU, or ZAP. + * + * Note: the first two fields should never be moved. When a storage pool + * is opened, the uberblock must be read off the disk before the version + * can be checked. If the ub_version field is moved, we may not detect + * version mismatch. If the ub_magic field is moved, applications that + * expect the magic number in the first word won't work. + */ +#define UBERBLOCK_MAGIC 0x00bab10c /* oo-ba-bloc! 
*/ +#define UBERBLOCK_SHIFT 10 /* up to 1K */ + +typedef struct uberblock { + uint64_t ub_magic; /* UBERBLOCK_MAGIC */ + uint64_t ub_version; /* ZFS_VERSION */ + uint64_t ub_txg; /* txg of last sync */ + uint64_t ub_guid_sum; /* sum of all vdev guids */ + uint64_t ub_timestamp; /* UTC time of last sync */ + blkptr_t ub_rootbp; /* MOS objset_phys_t */ +} uberblock_t; + +#define VDEV_UBERBLOCK_SHIFT(as) UBMAX(as, UBERBLOCK_SHIFT) +#define UBERBLOCK_SIZE(as) (1ULL << VDEV_UBERBLOCK_SHIFT(as)) + +/* Number of uberblocks that can fit in the ring at a given ashift */ +#define UBERBLOCK_COUNT(as) (VDEV_UBERBLOCK_RING >> VDEV_UBERBLOCK_SHIFT(as)) + +#endif /* _SYS_UBERBLOCK_IMPL_H */ diff --git a/include/zfs/vdev_impl.h b/include/zfs/vdev_impl.h new file mode 100644 index 0000000..97033c9 --- /dev/null +++ b/include/zfs/vdev_impl.h @@ -0,0 +1,69 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2010 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. 
+ */ + +#ifndef _SYS_VDEV_IMPL_H +#define _SYS_VDEV_IMPL_H + +#define VDEV_SKIP_SIZE (8 << 10) +#define VDEV_BOOT_HEADER_SIZE (8 << 10) +#define VDEV_PHYS_SIZE (112 << 10) +#define VDEV_UBERBLOCK_RING (128 << 10) + +/* ZFS boot block */ +#define VDEV_BOOT_MAGIC 0x2f5b007b10cULL +#define VDEV_BOOT_VERSION 1 /* version number */ + +typedef struct vdev_boot_header { + uint64_t vb_magic; /* VDEV_BOOT_MAGIC */ + uint64_t vb_version; /* VDEV_BOOT_VERSION */ + uint64_t vb_offset; /* start offset (bytes) */ + uint64_t vb_size; /* size (bytes) */ + char vb_pad[VDEV_BOOT_HEADER_SIZE - 4 * sizeof(uint64_t)]; +} vdev_boot_header_t; + +typedef struct vdev_phys { + char vp_nvlist[VDEV_PHYS_SIZE - sizeof(zio_eck_t)]; + zio_eck_t vp_zbt; +} vdev_phys_t; + +typedef struct vdev_label { + char vl_pad[VDEV_SKIP_SIZE]; /* 8K */ + vdev_boot_header_t vl_boot_header; /* 8K */ + vdev_phys_t vl_vdev_phys; /* 112K */ + char vl_uberblock[VDEV_UBERBLOCK_RING]; /* 128K */ +} vdev_label_t; /* 256K total */ + +/* + * Size and offset of embedded boot loader region on each label. + * The total size of the first two labels plus the boot area is 4MB. + */ +#define VDEV_BOOT_OFFSET (2 * sizeof(vdev_label_t)) +#define VDEV_BOOT_SIZE (7ULL << 19) /* 3.5M */ + +/* + * Size of label regions at the start and end of each leaf device. + */ +#define VDEV_LABEL_START_SIZE (2 * sizeof(vdev_label_t) + VDEV_BOOT_SIZE) +#define VDEV_LABEL_END_SIZE (2 * sizeof(vdev_label_t)) +#define VDEV_LABELS 4 + +#endif /* _SYS_VDEV_IMPL_H */ diff --git a/include/zfs/zap_impl.h b/include/zfs/zap_impl.h new file mode 100644 index 0000000..65e9311 --- /dev/null +++ b/include/zfs/zap_impl.h @@ -0,0 +1,112 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 + * Free Software Foundation, Inc. 
+ * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2009 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. + */ + +#ifndef _SYS_ZAP_IMPL_H +#define _SYS_ZAP_IMPL_H + +#define ZAP_MAGIC 0x2F52AB2ABULL + +#define ZAP_HASHBITS 28 +#define MZAP_ENT_LEN 64 +#define MZAP_NAME_LEN (MZAP_ENT_LEN - 8 - 4 - 2) +#define MZAP_MAX_BLKSHIFT SPA_MAXBLOCKSHIFT +#define MZAP_MAX_BLKSZ (1 << MZAP_MAX_BLKSHIFT) + +typedef struct mzap_ent_phys { + uint64_t mze_value; + uint32_t mze_cd; + uint16_t mze_pad; /* in case we want to chain them someday */ + char mze_name[MZAP_NAME_LEN]; +} mzap_ent_phys_t; + +typedef struct mzap_phys { + uint64_t mz_block_type; /* ZBT_MICRO */ + uint64_t mz_salt; + uint64_t mz_pad[6]; + mzap_ent_phys_t mz_chunk[1]; + /* actually variable size depending on block size */ +} mzap_phys_t; + +/* + * The (fat) zap is stored in one object. It is an array of + * 1<<FZAP_BLOCK_SHIFT byte blocks. The layout looks like one of: + * + * ptrtbl fits in first block: + * [zap_phys_t zap_ptrtbl_shift < 6] [zap_leaf_t] ... + * + * ptrtbl too big for first block: + * [zap_phys_t zap_ptrtbl_shift >= 6] [zap_leaf_t] [ptrtbl] ... 
+ * + */ + +#define ZBT_LEAF ((1ULL << 63) + 0) +#define ZBT_HEADER ((1ULL << 63) + 1) +#define ZBT_MICRO ((1ULL << 63) + 3) +/* any other values are ptrtbl blocks */ + +/* + * the embedded pointer table takes up half a block: + * block size / entry size (2^3) / 2 + */ +#define ZAP_EMBEDDED_PTRTBL_SHIFT(zap) (FZAP_BLOCK_SHIFT(zap) - 3 - 1) + +/* + * The embedded pointer table starts half-way through the block. Since + * the pointer table itself is half the block, it starts at (64-bit) + * word number (1<<ZAP_EMBEDDED_PTRTBL_SHIFT(zap)). + */ +#define ZAP_EMBEDDED_PTRTBL_ENT(zap, idx) \ + ((uint64_t *)(zap)->zap_f.zap_phys) \ + [(idx) + (1<<ZAP_EMBEDDED_PTRTBL_SHIFT(zap))] + +/* + * TAKE NOTE: + * If zap_phys_t is modified, zap_byteswap() must be modified. + */ +typedef struct zap_phys { + uint64_t zap_block_type; /* ZBT_HEADER */ + uint64_t zap_magic; /* ZAP_MAGIC */ + + struct zap_table_phys { + uint64_t zt_blk; /* starting block number */ + uint64_t zt_numblks; /* number of blocks */ + uint64_t zt_shift; /* bits to index it */ + uint64_t zt_nextblk; /* next (larger) copy start block */ + uint64_t zt_blks_copied; /* number source blocks copied */ + } zap_ptrtbl; + + uint64_t zap_freeblk; /* the next free block */ + uint64_t zap_num_leafs; /* number of leafs */ + uint64_t zap_num_entries; /* number of entries */ + uint64_t zap_salt; /* salt to stir into hash function */ + uint64_t zap_normflags; /* flags for u8_textprep_str() */ + uint64_t zap_flags; /* zap_flag_t */ + /* + * This structure is followed by padding, and then the embedded + * pointer table. The embedded pointer table takes up second + * half of the block. It is accessed using the + * ZAP_EMBEDDED_PTRTBL_ENT() macro. 
+ */ +} zap_phys_t; + +#endif /* _SYS_ZAP_IMPL_H */ diff --git a/include/zfs/zap_leaf.h b/include/zfs/zap_leaf.h new file mode 100644 index 0000000..4ddddb5 --- /dev/null +++ b/include/zfs/zap_leaf.h @@ -0,0 +1,103 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 + * Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2007 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. + */ + +#ifndef _SYS_ZAP_LEAF_H +#define _SYS_ZAP_LEAF_H + +#define ZAP_LEAF_MAGIC 0x2AB1EAF + +/* chunk size = 24 bytes */ +#define ZAP_LEAF_CHUNKSIZE 24 + +/* + * The amount of space within the chunk available for the array is: + * chunk size - space for type (1) - space for next pointer (2) + */ +#define ZAP_LEAF_ARRAY_BYTES (ZAP_LEAF_CHUNKSIZE - 3) + +typedef enum zap_chunk_type { + ZAP_CHUNK_FREE = 253, + ZAP_CHUNK_ENTRY = 252, + ZAP_CHUNK_ARRAY = 251, + ZAP_CHUNK_TYPE_MAX = 250 +} zap_chunk_type_t; + +/* + * TAKE NOTE: + * If zap_leaf_phys_t is modified, zap_leaf_byteswap() must be modified. 
+ */ +typedef struct zap_leaf_phys { + struct zap_leaf_header { + uint64_t lh_block_type; /* ZBT_LEAF */ + uint64_t lh_pad1; + uint64_t lh_prefix; /* hash prefix of this leaf */ + uint32_t lh_magic; /* ZAP_LEAF_MAGIC */ + uint16_t lh_nfree; /* number free chunks */ + uint16_t lh_nentries; /* number of entries */ + uint16_t lh_prefix_len; /* num bits used to id this */ + + /* above is accessable to zap, below is zap_leaf private */ + + uint16_t lh_freelist; /* chunk head of free list */ + uint8_t lh_pad2[12]; + } l_hdr; /* 2 24-byte chunks */ + + /* + * The header is followed by a hash table with + * ZAP_LEAF_HASH_NUMENTRIES(zap) entries. The hash table is + * followed by an array of ZAP_LEAF_NUMCHUNKS(zap) + * zap_leaf_chunk structures. These structures are accessed + * with the ZAP_LEAF_CHUNK() macro. + */ + + uint16_t l_hash[1]; +} zap_leaf_phys_t; + +typedef union zap_leaf_chunk { + struct zap_leaf_entry { + uint8_t le_type; /* always ZAP_CHUNK_ENTRY */ + uint8_t le_int_size; /* size of ints */ + uint16_t le_next; /* next entry in hash chain */ + uint16_t le_name_chunk; /* first chunk of the name */ + uint16_t le_name_length; /* bytes in name, incl null */ + uint16_t le_value_chunk; /* first chunk of the value */ + uint16_t le_value_length; /* value length in ints */ + uint32_t le_cd; /* collision differentiator */ + uint64_t le_hash; /* hash value of the name */ + } l_entry; + struct zap_leaf_array { + uint8_t la_type; /* always ZAP_CHUNK_ARRAY */ + union { + uint8_t la_array[ZAP_LEAF_ARRAY_BYTES]; + uint64_t la_array64; + } __attribute__ ((packed)); + uint16_t la_next; /* next blk or CHAIN_END */ + } l_array; + struct zap_leaf_free { + uint8_t lf_type; /* always ZAP_CHUNK_FREE */ + uint8_t lf_pad[ZAP_LEAF_ARRAY_BYTES]; + uint16_t lf_next; /* next in free list, or CHAIN_END */ + } l_free; +} zap_leaf_chunk_t; + +#endif /* _SYS_ZAP_LEAF_H */ diff --git a/include/zfs/zfs.h b/include/zfs/zfs.h new file mode 100644 index 0000000..b6d41c0 --- /dev/null +++ 
b/include/zfs/zfs.h @@ -0,0 +1,122 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004,2009 + * Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ + /* + * Copyright (c) 2007, 2010, Oracle and/or its affiliates. All rights reserved. + */ + +#ifndef GRUB_ZFS_HEADER +#define GRUB_ZFS_HEADER 1 + + +/* + * On-disk version number. + */ +#define SPA_VERSION 28ULL + +/* + * The following are configuration names used in the nvlist describing a pool's + * configuration. 
+ */ +#define ZPOOL_CONFIG_VERSION "version" +#define ZPOOL_CONFIG_POOL_NAME "name" +#define ZPOOL_CONFIG_POOL_STATE "state" +#define ZPOOL_CONFIG_POOL_TXG "txg" +#define ZPOOL_CONFIG_POOL_GUID "pool_guid" +#define ZPOOL_CONFIG_CREATE_TXG "create_txg" +#define ZPOOL_CONFIG_TOP_GUID "top_guid" +#define ZPOOL_CONFIG_VDEV_TREE "vdev_tree" +#define ZPOOL_CONFIG_TYPE "type" +#define ZPOOL_CONFIG_CHILDREN "children" +#define ZPOOL_CONFIG_ID "id" +#define ZPOOL_CONFIG_GUID "guid" +#define ZPOOL_CONFIG_PATH "path" +#define ZPOOL_CONFIG_DEVID "devid" +#define ZPOOL_CONFIG_METASLAB_ARRAY "metaslab_array" +#define ZPOOL_CONFIG_METASLAB_SHIFT "metaslab_shift" +#define ZPOOL_CONFIG_ASHIFT "ashift" +#define ZPOOL_CONFIG_ASIZE "asize" +#define ZPOOL_CONFIG_DTL "DTL" +#define ZPOOL_CONFIG_STATS "stats" +#define ZPOOL_CONFIG_WHOLE_DISK "whole_disk" +#define ZPOOL_CONFIG_ERRCOUNT "error_count" +#define ZPOOL_CONFIG_NOT_PRESENT "not_present" +#define ZPOOL_CONFIG_SPARES "spares" +#define ZPOOL_CONFIG_IS_SPARE "is_spare" +#define ZPOOL_CONFIG_NPARITY "nparity" +#define ZPOOL_CONFIG_PHYS_PATH "phys_path" +#define ZPOOL_CONFIG_L2CACHE "l2cache" +#define ZPOOL_CONFIG_HOLE_ARRAY "hole_array" +#define ZPOOL_CONFIG_VDEV_CHILDREN "vdev_children" +#define ZPOOL_CONFIG_IS_HOLE "is_hole" +#define ZPOOL_CONFIG_DDT_HISTOGRAM "ddt_histogram" +#define ZPOOL_CONFIG_DDT_OBJ_STATS "ddt_object_stats" +#define ZPOOL_CONFIG_DDT_STATS "ddt_stats" +/* + * The persistent vdev state is stored as separate values rather than a single + * 'vdev_state' entry. This is because a device can be in multiple states, such + * as offline and degraded. 
+ */ +#define ZPOOL_CONFIG_OFFLINE "offline" +#define ZPOOL_CONFIG_FAULTED "faulted" +#define ZPOOL_CONFIG_DEGRADED "degraded" +#define ZPOOL_CONFIG_REMOVED "removed" + +#define VDEV_TYPE_ROOT "root" +#define VDEV_TYPE_MIRROR "mirror" +#define VDEV_TYPE_REPLACING "replacing" +#define VDEV_TYPE_RAIDZ "raidz" +#define VDEV_TYPE_DISK "disk" +#define VDEV_TYPE_FILE "file" +#define VDEV_TYPE_MISSING "missing" +#define VDEV_TYPE_HOLE "hole" +#define VDEV_TYPE_SPARE "spare" +#define VDEV_TYPE_L2CACHE "l2cache" + +/* + * pool state. The following states are written to disk as part of the normal + * SPA lifecycle: ACTIVE, EXPORTED, DESTROYED, SPARE, L2CACHE. The remaining + * states are software abstractions used at various levels to communicate pool + * state. + */ +typedef enum pool_state { + POOL_STATE_ACTIVE = 0, /* In active use */ + POOL_STATE_EXPORTED, /* Explicitly exported */ + POOL_STATE_DESTROYED, /* Explicitly destroyed */ + POOL_STATE_SPARE, /* Reserved for hot spare use */ + POOL_STATE_L2CACHE, /* Level 2 ARC device */ + POOL_STATE_UNINITIALIZED, /* Internal spa_t state */ + POOL_STATE_UNAVAIL, /* Internal libzfs state */ + POOL_STATE_POTENTIALLY_ACTIVE /* Internal libzfs state */ +} pool_state_t; + +struct grub_zfs_data; + +int grub_zfs_fetch_nvlist(device_t dev, char **nvlist); +int grub_zfs_getmdnobj(device_t dev, const char *fsfilename, + uint64_t *mdnobj); + +char *grub_zfs_nvlist_lookup_string(char *nvlist, char *name); +char *grub_zfs_nvlist_lookup_nvlist(char *nvlist, char *name); +int grub_zfs_nvlist_lookup_uint64(char *nvlist, char *name, + uint64_t *out); +char *grub_zfs_nvlist_lookup_nvlist_array(char *nvlist, char *name, + size_t index); +int grub_zfs_nvlist_lookup_nvlist_array_get_nelm(char *nvlist, char *name); + +#endif /* ! 
GRUB_ZFS_HEADER */ diff --git a/include/zfs/zfs_acl.h b/include/zfs/zfs_acl.h new file mode 100644 index 0000000..66749af --- /dev/null +++ b/include/zfs/zfs_acl.h @@ -0,0 +1,55 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 + * Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2007 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. 
+ */ + +#ifndef _SYS_FS_ZFS_ACL_H +#define _SYS_FS_ZFS_ACL_H + +typedef struct zfs_oldace { + uint32_t z_fuid; /* "who" */ + uint32_t z_access_mask; /* access mask */ + uint16_t z_flags; /* flags, i.e inheritance */ + uint16_t z_type; /* type of entry allow/deny */ +} zfs_oldace_t; + +#define ACE_SLOT_CNT 6 + +typedef struct zfs_znode_acl_v0 { + uint64_t z_acl_extern_obj; /* ext acl pieces */ + uint32_t z_acl_count; /* Number of ACEs */ + uint16_t z_acl_version; /* acl version */ + uint16_t z_acl_pad; /* pad */ + zfs_oldace_t z_ace_data[ACE_SLOT_CNT]; /* 6 standard ACEs */ +} zfs_znode_acl_v0_t; + +#define ZFS_ACE_SPACE (sizeof(zfs_oldace_t) * ACE_SLOT_CNT) + +typedef struct zfs_znode_acl { + uint64_t z_acl_extern_obj; /* ext acl pieces */ + uint32_t z_acl_size; /* Number of bytes in ACL */ + uint16_t z_acl_version; /* acl version */ + uint16_t z_acl_count; /* ace count */ + uint8_t z_ace_data[ZFS_ACE_SPACE]; /* space for embedded ACEs */ +} zfs_znode_acl_t; + + +#endif /* _SYS_FS_ZFS_ACL_H */ diff --git a/include/zfs/zfs_znode.h b/include/zfs/zfs_znode.h new file mode 100644 index 0000000..e3265e3 --- /dev/null +++ b/include/zfs/zfs_znode.h @@ -0,0 +1,70 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2010 Sun Microsystems, Inc. 
All rights reserved. + * Use is subject to license terms. + */ + +#ifndef _SYS_FS_ZFS_ZNODE_H +#define _SYS_FS_ZFS_ZNODE_H + +#include <zfs/zfs_acl.h> + +#define MASTER_NODE_OBJ 1 +#define ZFS_ROOT_OBJ "ROOT" +#define ZPL_VERSION_STR "VERSION" +#define ZFS_SA_ATTRS "SA_ATTRS" + +#define ZPL_VERSION 5ULL + +#define ZFS_DIRENT_OBJ(de) BF64_GET(de, 0, 48) + +/* + * This is the persistent portion of the znode. It is stored + * in the "bonus buffer" of the file. Short symbolic links + * are also stored in the bonus buffer. + */ +typedef struct znode_phys { + uint64_t zp_atime[2]; /* 0 - last file access time */ + uint64_t zp_mtime[2]; /* 16 - last file modification time */ + uint64_t zp_ctime[2]; /* 32 - last file change time */ + uint64_t zp_crtime[2]; /* 48 - creation time */ + uint64_t zp_gen; /* 64 - generation (txg of creation) */ + uint64_t zp_mode; /* 72 - file mode bits */ + uint64_t zp_size; /* 80 - size of file */ + uint64_t zp_parent; /* 88 - directory parent (`..') */ + uint64_t zp_links; /* 96 - number of links to file */ + uint64_t zp_xattr; /* 104 - DMU object for xattrs */ + uint64_t zp_rdev; /* 112 - dev_t for VBLK & VCHR files */ + uint64_t zp_flags; /* 120 - persistent flags */ + uint64_t zp_uid; /* 128 - file owner */ + uint64_t zp_gid; /* 136 - owning group */ + uint64_t zp_pad[4]; /* 144 - future */ + zfs_znode_acl_t zp_acl; /* 176 - 263 ACL */ + /* + * Data may pad out any remaining bytes in the znode buffer, eg: + * + * |<---------------------- dnode_phys (512) ------------------------>| + * |<-- dnode (192) --->|<----------- "bonus" buffer (320) ---------->| + * |<---- znode (264) ---->|<---- data (56) ---->| + * + * At present, we only use this space to store symbolic links. 
+ */ +} znode_phys_t; + +#endif /* _SYS_FS_ZFS_ZNODE_H */ diff --git a/include/zfs/zil.h b/include/zfs/zil.h new file mode 100644 index 0000000..bc9d5e9 --- /dev/null +++ b/include/zfs/zil.h @@ -0,0 +1,56 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2009 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. + */ + +#ifndef _SYS_ZIL_H +#define _SYS_ZIL_H + +/* + * Intent log format: + * + * Each objset has its own intent log. The log header (zil_header_t) + * for objset N's intent log is kept in the Nth object of the SPA's + * intent_log objset. The log header points to a chain of log blocks, + * each of which contains log records (i.e., transactions) followed by + * a log block trailer (zil_trailer_t). The format of a log record + * depends on the record (or transaction) type, but all records begin + * with a common structure that defines the type, length, and txg. + */ + +/* + * Intent log header - this on disk structure holds fields to manage + * the log. All fields are 64 bit to easily handle cross architectures. 
+ */ +typedef struct zil_header { + uint64_t zh_claim_txg; /* txg in which log blocks were claimed */ + uint64_t zh_replay_seq; /* highest replayed sequence number */ + blkptr_t zh_log; /* log chain */ + uint64_t zh_claim_seq; /* highest claimed sequence number */ + uint64_t zh_flags; /* header flags */ + uint64_t zh_pad[4]; +} zil_header_t; + +/* + * zh_flags bit settings + */ +#define ZIL_REPLAY_NEEDED 0x1 /* replay needed - internal only */ + +#endif /* _SYS_ZIL_H */ diff --git a/include/zfs/zio.h b/include/zfs/zio.h new file mode 100644 index 0000000..38f90d5 --- /dev/null +++ b/include/zfs/zio.h @@ -0,0 +1,92 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2010 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. + */ + +#ifndef _ZIO_H +#define _ZIO_H + +#include <zfs/spa.h> + +#define ZEC_MAGIC 0x210da7ab10c7a11ULL /* zio data bloc tail */ + +typedef struct zio_eck { + uint64_t zec_magic; /* for validation, endianness */ + zio_cksum_t zec_cksum; /* 256-bit checksum */ +} zio_eck_t; + +/* + * Gang block headers are self-checksumming and contain an array + * of block pointers. 
+ */ +#define SPA_GANGBLOCKSIZE SPA_MINBLOCKSIZE +#define SPA_GBH_NBLKPTRS ((SPA_GANGBLOCKSIZE - \ + sizeof(zio_eck_t)) / sizeof(blkptr_t)) +#define SPA_GBH_FILLER ((SPA_GANGBLOCKSIZE - \ + sizeof(zio_eck_t) - \ + (SPA_GBH_NBLKPTRS * sizeof(blkptr_t))) /\ + sizeof(uint64_t)) + +#define ZIO_GET_IOSIZE(zio) \ + (BP_IS_GANG((zio)->io_bp) ? \ + SPA_GANGBLOCKSIZE : BP_GET_PSIZE((zio)->io_bp)) + +typedef struct zio_gbh { + blkptr_t zg_blkptr[SPA_GBH_NBLKPTRS]; + uint64_t zg_filler[SPA_GBH_FILLER]; + zio_eck_t zg_tail; +} zio_gbh_phys_t; + +enum zio_checksum { + ZIO_CHECKSUM_INHERIT = 0, + ZIO_CHECKSUM_ON, + ZIO_CHECKSUM_OFF, + ZIO_CHECKSUM_LABEL, + ZIO_CHECKSUM_GANG_HEADER, + ZIO_CHECKSUM_ZILOG, + ZIO_CHECKSUM_FLETCHER_2, + ZIO_CHECKSUM_FLETCHER_4, + ZIO_CHECKSUM_SHA256, + ZIO_CHECKSUM_ZILOG2, + ZIO_CHECKSUM_FUNCTIONS +}; + +#define ZIO_CHECKSUM_ON_VALUE ZIO_CHECKSUM_FLETCHER_2 +#define ZIO_CHECKSUM_DEFAULT ZIO_CHECKSUM_ON + +enum zio_compress { + ZIO_COMPRESS_INHERIT = 0, + ZIO_COMPRESS_ON, + ZIO_COMPRESS_OFF, + ZIO_COMPRESS_LZJB, + ZIO_COMPRESS_EMPTY, + ZIO_COMPRESS_GZIP1, + ZIO_COMPRESS_GZIP2, + ZIO_COMPRESS_GZIP3, + ZIO_COMPRESS_GZIP4, + ZIO_COMPRESS_GZIP5, + ZIO_COMPRESS_GZIP6, + ZIO_COMPRESS_GZIP7, + ZIO_COMPRESS_GZIP8, + ZIO_COMPRESS_GZIP9, + ZIO_COMPRESS_FUNCTIONS +}; + +#endif /* _ZIO_H */ diff --git a/include/zfs/zio_checksum.h b/include/zfs/zio_checksum.h new file mode 100644 index 0000000..8ade44a --- /dev/null +++ b/include/zfs/zio_checksum.h @@ -0,0 +1,49 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. 
+ * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2010 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. + */ + +#ifndef _SYS_ZIO_CHECKSUM_H +#define _SYS_ZIO_CHECKSUM_H + +/* + * Signature for checksum functions. + */ +typedef void zio_checksum_t(const void *data, uint64_t size, + grub_zfs_endian_t endian, zio_cksum_t *zcp); + +/* + * Information about each checksum function. + */ +typedef struct zio_checksum_info { + zio_checksum_t *ci_func; /* checksum function for each byteorder */ + int ci_correctable; /* number of correctable bits */ + int ci_eck; /* uses zio embedded checksum? */ + char *ci_name; /* descriptive name */ +} zio_checksum_info_t; + +extern void zio_checksum_SHA256(const void *, uint64_t, + grub_zfs_endian_t endian, zio_cksum_t *); +extern void fletcher_2(const void *, uint64_t, grub_zfs_endian_t endian, + zio_cksum_t *); +extern void fletcher_4(const void *, uint64_t, grub_zfs_endian_t endian, + zio_cksum_t *); + +#endif /* _SYS_ZIO_CHECKSUM_H */
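
For readers checking the on-disk layout: the micro-ZAP definitions in zap_impl.h above imply that each entry is exactly MZAP_ENT_LEN (64) bytes and that the mzap_phys_t header in front of the first entry is also 64 bytes. The following is a minimal host-side sketch of that arithmetic, not code from the patch. It re-declares the two structures as they appear in the header, and the helper mzap_entries_per_block() is a hypothetical name introduced only for illustration.

```c
/*
 * Sketch only: re-declares the micro-ZAP structures from zap_impl.h so
 * that the layout arithmetic can be checked with a host compiler (C11).
 * mzap_entries_per_block() is a hypothetical helper, not part of the patch.
 */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MZAP_ENT_LEN	64
#define MZAP_NAME_LEN	(MZAP_ENT_LEN - 8 - 4 - 2)	/* 50 bytes of name */

typedef struct mzap_ent_phys {
	uint64_t mze_value;
	uint32_t mze_cd;
	uint16_t mze_pad;
	char mze_name[MZAP_NAME_LEN];
} mzap_ent_phys_t;

typedef struct mzap_phys {
	uint64_t mz_block_type;
	uint64_t mz_salt;
	uint64_t mz_pad[6];
	mzap_ent_phys_t mz_chunk[1];	/* really variable length */
} mzap_phys_t;

/* 8 + 4 + 2 + 50 bytes: one entry fills exactly one 64-byte chunk. */
static_assert(sizeof(mzap_ent_phys_t) == MZAP_ENT_LEN,
	      "micro-ZAP entry must be 64 bytes");

/* 8 + 8 + 6 * 8 bytes: the header occupies the first 64-byte chunk. */
static_assert(offsetof(mzap_phys_t, mz_chunk) == MZAP_ENT_LEN,
	      "micro-ZAP header must be 64 bytes");

/* Number of entries that fit in a micro-ZAP block of block_size bytes. */
static inline size_t mzap_entries_per_block(size_t block_size)
{
	const size_t hdr = offsetof(mzap_phys_t, mz_chunk);

	if (block_size < hdr)
		return 0;
	return (block_size - hdr) / sizeof(mzap_ent_phys_t);
}
```

Under these definitions a 512-byte block has room for 7 entries, since the first of its eight 64-byte chunks is taken by the header. The static assertions fail at compile time if a compiler pads the structures differently, which is the property an on-disk structure depends on.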

Dear Jorgen Lundman,
In message 1337744719-27487-2-git-send-email-lundman@lundman.net you wrote:
Signed-off-by: Jorgen Lundman lundman@lundman.net
Please provide a useful commit message.
Also, this being a V2 patch, you are supposed to provide a history of changes. Please see http://www.denx.de/wiki/view/U-Boot/Patches#Sending_updated_patch_versions
Finally, it makes little sense to add the headers separately. Please squash both patches into one.
And please add information about where _exactly_ the code has been copied from; for an example of how to do this please see http://www.denx.de/wiki/view/U-Boot/Patches#Attributing_Code_Copyrights_Sign i.e. please include information about the download or repository URL plus the exact version that was used as the base for this work.
Thanks.
Best regards,
Wolfgang Denk

In message 1337744719-27487-2-git-send-email-lundman@lundman.net you wrote:
Signed-off-by: Jorgen Lundman lundman@lundman.net
Please provide a useful commit message.
Graeme has been schooling me on this and I believe I know what to do in V3. He also recommended I wait a day or so, so that others can have a chance to comment.
Finally, it makes little sense to add the headers separately. Please squash both patches into one.
The wiki mentions a strict 100KB limit per mail, so I made 2 patches to stay under it. The headers split was just an arbitrary logical split I made.
And please add information about where _exactly_ the code has been copied from; for an example of how to do this please see http://www.denx.de/wiki/view/U-Boot/Patches#Attributing_Code_Copyrights_Sign i.e. please include information about the download or repository URL plus the exact version that was used as the base for this work.
Understood.

Signed-off-by: Jorgen Lundman lundman@lundman.net --- Makefile | 2 +- common/Makefile | 1 + common/cmd_zfs.c | 236 +++++ fs/Makefile | 3 +- fs/{ => zfs}/Makefile | 39 +- fs/zfs/dev.c | 137 +++ fs/zfs/zfs.c | 2396 ++++++++++++++++++++++++++++++++++++++++++++++ fs/zfs/zfs_fletcher.c | 84 ++ fs/zfs/zfs_lzjb.c | 94 ++ fs/zfs/zfs_sha256.c | 145 +++ include/config_cmd_all.h | 1 + include/zfs_common.h | 94 ++ 12 files changed, 3215 insertions(+), 17 deletions(-) create mode 100644 common/cmd_zfs.c copy fs/{ => zfs}/Makefile (56%) create mode 100644 fs/zfs/dev.c create mode 100644 fs/zfs/zfs.c create mode 100644 fs/zfs/zfs_fletcher.c create mode 100644 fs/zfs/zfs_lzjb.c create mode 100644 fs/zfs/zfs_sha256.c create mode 100644 include/zfs_common.h
diff --git a/Makefile b/Makefile index 351a8f0..d3b84bf 100644 --- a/Makefile +++ b/Makefile @@ -244,7 +244,7 @@ endif LIBS += arch/$(ARCH)/lib/lib$(ARCH).o LIBS += fs/cramfs/libcramfs.o fs/fat/libfat.o fs/fdos/libfdos.o fs/jffs2/libjffs2.o \ fs/reiserfs/libreiserfs.o fs/ext2/libext2fs.o fs/yaffs2/libyaffs2.o \ - fs/ubifs/libubifs.o + fs/ubifs/libubifs.o fs/zfs/libzfs.o LIBS += net/libnet.o LIBS += disk/libdisk.o LIBS += drivers/bios_emulator/libatibiosemu.o diff --git a/common/Makefile b/common/Makefile index 6e23baa..4de03da 100644 --- a/common/Makefile +++ b/common/Makefile @@ -164,6 +164,7 @@ COBJS-$(CONFIG_USB_STORAGE) += usb_storage.o endif COBJS-$(CONFIG_CMD_XIMG) += cmd_ximg.o COBJS-$(CONFIG_YAFFS2) += cmd_yaffs2.o +COBJS-$(CONFIG_CMD_ZFS) += cmd_zfs.o COBJS-$(CONFIG_CMD_SPL) += cmd_spl.o
# others diff --git a/common/cmd_zfs.c b/common/cmd_zfs.c new file mode 100644 index 0000000..a6ea2c0 --- /dev/null +++ b/common/cmd_zfs.c @@ -0,0 +1,236 @@ +/* + * + * ZFS filesystem porting to Uboot by + * Jorgen Lundman <lundman at lundman.net> + * + * zfsfs support + * made from existing GRUB Sources by Sun, GNU and others. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License as + * published by the Free Software Foundation; either version 2 of + * the License, or (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, + * MA 02111-1307 USA + * + */ + +#include <common.h> +#include <part.h> +#include <config.h> +#include <command.h> +#include <image.h> +#include <linux/ctype.h> +#include <asm/byteorder.h> +#include <zfs_common.h> +#include <linux/stat.h> +#include <malloc.h> + +#if defined(CONFIG_CMD_USB) && defined(CONFIG_USB_STORAGE) +#include <usb.h> +#endif + +#if !defined(CONFIG_DOS_PARTITION) && !defined(CONFIG_EFI_PARTITION) +#error DOS or EFI partition support must be selected +#endif + +#define DOS_PART_MAGIC_OFFSET 0x1fe +#define DOS_FS_TYPE_OFFSET 0x36 +#define DOS_FS32_TYPE_OFFSET 0x52 + +static int do_zfs_load(cmd_tbl_t *cmdtp, int flag, int argc, char *argv[]) +{ + char *filename = NULL; + char *ep; + int dev; + unsigned long part = 1; + ulong addr = 0; + ulong part_length; + disk_partition_t info; + char buf[12]; + unsigned long count; + const char *addr_str; + struct zfs_file zfile; + struct device_s vdev; + + if (argc < 3) + return 
CMD_RET_USAGE; + + count = 0; + addr = simple_strtoul(argv[3], NULL, 16); + filename = getenv("bootfile"); + switch (argc) { + case 3: + addr_str = getenv("loadaddr"); + if (addr_str != NULL) + addr = simple_strtoul(addr_str, NULL, 16); + else + addr = CONFIG_SYS_LOAD_ADDR; + + break; + case 4: + break; + case 5: + filename = argv[4]; + break; + case 6: + filename = argv[4]; + count = simple_strtoul(argv[5], NULL, 16); + break; + + default: + return cmd_usage(cmdtp); + } + + if (!filename) { + puts("** No boot file defined **\n"); + return 1; + } + + dev = (int)simple_strtoul(argv[2], &ep, 16); + zfs_dev_desc = get_dev(argv[1], dev); + if (zfs_dev_desc == NULL) { + printf("** Block device %s %d not supported\n", argv[1], dev); + return 1; + } + + if (*ep) { + if (*ep != ':') { + puts("** Invalid boot device, use `dev[:part]' **\n"); + return 1; + } + part = simple_strtoul(++ep, NULL, 16); + } + + if (part != 0) { + if (get_partition_info(zfs_dev_desc, part, &info)) { + printf("** Bad partition %lu **\n", part); + return 1; + } + + if (strncmp((char *)info.type, BOOT_PART_TYPE, + strlen(BOOT_PART_TYPE)) != 0) { + printf("** Invalid partition type \"%s\" (expect \"" BOOT_PART_TYPE "\")\n", + info.type); + return 1; + } + printf("Loading file \"%s\" " + "from %s device %d:%lu %s\n", + filename, argv[1], dev, part, info.name); + } else { + printf("Loading file \"%s\" from %s device %d\n", + filename, argv[1], dev); + } + + part_length = zfs_set_blk_dev(zfs_dev_desc, part); + if (part_length == 0) { + printf("**Bad partition - %s %d:%lu **\n", argv[1], dev, part); + return 1; + } + + vdev.part_length = part_length; + + memset(&zfile, 0, sizeof(zfile)); + zfile.device = &vdev; + if (zfs_open(&zfile, filename)) { + printf("** File not found %s\n", filename); + return 1; + } + + if ((count < zfile.size) && (count != 0)) + zfile.size = (uint64_t)count; + + if (zfs_read(&zfile, (char *)addr, zfile.size) != zfile.size) { + printf("** Unable to read \"%s\" from %s %d:%lu **\n", +
filename, argv[1], dev, part); + zfs_close(&zfile); + return 1; + } + + zfs_close(&zfile); + + /* Loading ok, update default load address */ + load_addr = addr; + + printf("%llu bytes read\n", zfile.size); + sprintf(buf, "%llX", zfile.size); + setenv("filesize", buf); + + return 0; +} + + +int zfs_print(const char *entry, const struct zfs_dirhook_info *data) +{ + printf("%s %s\n", + data->dir ? "<DIR> " : " ", + entry); + return 0; /* 0 continue, 1 stop */ +} + + + +static int do_zfs_ls(cmd_tbl_t *cmdtp, int flag, int argc, char *argv[]) +{ + const char *filename = "/"; + int dev; + unsigned long part = 1; + char *ep; + int part_length; + struct device_s vdev; + + if (argc < 3) + return cmd_usage(cmdtp); + + dev = (int)simple_strtoul(argv[2], &ep, 16); + zfs_dev_desc = get_dev(argv[1], dev); + + if (zfs_dev_desc == NULL) { + printf("\n** Block device %s %d not supported\n", argv[1], dev); + return 1; + } + + if (*ep) { + if (*ep != ':') { + puts("\n** Invalid boot device, use `dev[:part]' **\n"); + return 1; + } + part = simple_strtoul(++ep, NULL, 16); + } + + if (argc == 4) + filename = argv[3]; + + part_length = zfs_set_blk_dev(zfs_dev_desc, part); + if (part_length == 0) { + printf("** Bad partition - %s %d:%lu **\n", argv[1], dev, part); + return 1; + } + + vdev.part_length = part_length; + + zfs_ls(&vdev, filename, + zfs_print); + + return 0; +} + + +U_BOOT_CMD(zfsls, 4, 1, do_zfs_ls, + "list files in a directory (default /)", + "<interface> <dev[:part]> [directory]\n" + " - list files from 'dev' on 'interface' in a '/DATASET/@/$dir/'"); + +U_BOOT_CMD(zfsload, 6, 0, do_zfs_load, + "load binary file from a ZFS filesystem", + "<interface> <dev[:part]> [addr] [filename] [bytes]\n" + " - load binary file '/DATASET/@/$dir/$file' from 'dev' on 'interface'\n" + " to address 'addr' from ZFS filesystem"); diff --git a/fs/Makefile b/fs/Makefile index 22aad12..baa7e96 100644 --- a/fs/Makefile +++ b/fs/Makefile @@ -1,6 +1,6 @@ # # (C) Copyright 2000-2006 -# Wolfgang Denk, 
DENX Software Engineering, wd@denx.de. +# Wolfgang Denk, DENX Software Engineering, <wd at denx.de> # # See file CREDITS for list of people who contributed to this # project. @@ -30,6 +30,7 @@ subdirs-$(CONFIG_CMD_JFFS2) += jffs2 subdirs-$(CONFIG_CMD_REISER) += reiserfs subdirs-$(CONFIG_YAFFS2) += yaffs2 subdirs-$(CONFIG_CMD_UBIFS) += ubifs +subdirs-$(CONFIG_CMD_ZFS) += zfs
SUBDIRS := $(subdirs-y)
diff --git a/fs/Makefile b/fs/zfs/Makefile similarity index 56% copy from fs/Makefile copy to fs/zfs/Makefile index 22aad12..938fc5e 100644 --- a/fs/Makefile +++ b/fs/zfs/Makefile @@ -1,6 +1,6 @@ # -# (C) Copyright 2000-2006 -# Wolfgang Denk, DENX Software Engineering, wd@denx.de. +# (C) Copyright 2012 +# Jorgen Lundman <lundman at lundman.net> # # See file CREDITS for list of people who contributed to this # project. @@ -20,19 +20,28 @@ # Foundation, Inc., 59 Temple Place, Suite 330, Boston, # MA 02111-1307 USA # -#
-subdirs-$(CONFIG_CMD_CRAMFS) := cramfs -subdirs-$(CONFIG_CMD_EXT2) += ext2 -subdirs-$(CONFIG_CMD_FAT) += fat -subdirs-$(CONFIG_CMD_FDOS) += fdos -subdirs-$(CONFIG_CMD_JFFS2) += jffs2 -subdirs-$(CONFIG_CMD_REISER) += reiserfs -subdirs-$(CONFIG_YAFFS2) += yaffs2 -subdirs-$(CONFIG_CMD_UBIFS) += ubifs +include $(TOPDIR)/config.mk + +LIB = $(obj)libzfs.o + +AOBJS = +COBJS-$(CONFIG_CMD_ZFS) := dev.o zfs.o zfs_fletcher.o zfs_sha256.o zfs_lzjb.o + +SRCS := $(AOBJS:.o=.S) $(COBJS-y:.o=.c) +OBJS := $(addprefix $(obj),$(AOBJS) $(COBJS-y)) + + +all: $(LIB) $(AOBJS) + +$(LIB): $(obj).depend $(OBJS) + $(call cmd_link_o_target, $(OBJS)) + +######################################################################### + +# defines $(obj).depend target +include $(SRCTREE)/rules.mk
-SUBDIRS := $(subdirs-y) +sinclude $(obj).depend
-$(obj).depend all: - @for dir in $(SUBDIRS) ; do \ - $(MAKE) -C $$dir $@ ; done +######################################################################### diff --git a/fs/zfs/dev.c b/fs/zfs/dev.c new file mode 100644 index 0000000..ab32865 --- /dev/null +++ b/fs/zfs/dev.c @@ -0,0 +1,137 @@ +/* + * + * based on code of fs/reiserfs/dev.c by + * + * (C) Copyright 2003 - 2004 + * Sysgo AG, <www.elinos.com>, Pavel Bartusek pba@sysgo.com + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. 
+ */ + + +#include <common.h> +#include <config.h> +#include <zfs_common.h> + +static block_dev_desc_t *zfs_block_dev_desc; +static disk_partition_t part_info; + +int zfs_set_blk_dev(block_dev_desc_t *rbdd, int part) +{ + zfs_block_dev_desc = rbdd; + + if (part == 0) { + /* disk doesn't use partition table */ + part_info.start = 0; + part_info.size = rbdd->lba; + part_info.blksz = rbdd->blksz; + } else { + if (get_partition_info(zfs_block_dev_desc, part, &part_info)) + return 0; + } + + return part_info.size; +} + +/* err */ +int zfs_devread(int sector, int byte_offset, int byte_len, char *buf) +{ + short sec_buffer[SECTOR_SIZE/sizeof(short)]; + char *sec_buf = sec_buffer; + unsigned block_len; + + /* + * Check partition boundaries + */ + if ((sector < 0) || + ((sector + ((byte_offset + byte_len - 1) >> SECTOR_BITS)) >= + part_info.size)) { + /* errnum = ERR_OUTSIDE_PART; */ + printf(" ** zfs_devread() read outside partition sector %d\n", sector); + return 1; + } + + /* + * Get the read to the beginning of a partition. 
+ */ + sector += byte_offset >> SECTOR_BITS; + byte_offset &= SECTOR_SIZE - 1; + + debug(" <%d, %d, %d>\n", sector, byte_offset, byte_len); + + if (zfs_block_dev_desc == NULL) { + printf("** Invalid Block Device Descriptor (NULL)\n"); + return 1; + } + + if (byte_offset != 0) { + /* read first part which isn't aligned with start of sector */ + if (zfs_block_dev_desc->block_read(zfs_block_dev_desc->dev, + part_info.start + sector, 1, + (unsigned long *) sec_buf) != 1) { + printf(" ** zfs_devread() read error **\n"); + return 1; + } + memcpy(buf, sec_buf + byte_offset, + min(SECTOR_SIZE - byte_offset, byte_len)); + buf += min(SECTOR_SIZE - byte_offset, byte_len); + byte_len -= min(SECTOR_SIZE - byte_offset, byte_len); + sector++; + } + + if (byte_len == 0) + return 0; + + /* read sector aligned part */ + block_len = byte_len & ~(SECTOR_SIZE - 1); + + if (block_len == 0) { + u8 p[SECTOR_SIZE]; + + block_len = SECTOR_SIZE; + zfs_block_dev_desc->block_read(zfs_block_dev_desc->dev, + part_info.start + sector, + 1, (unsigned long *)p); + memcpy(buf, p, byte_len); + return 0; + } + + if (zfs_block_dev_desc->block_read(zfs_block_dev_desc->dev, + part_info.start + sector, + block_len / SECTOR_SIZE, + (unsigned long *) buf) != + block_len / SECTOR_SIZE) { + printf(" ** zfs_devread() read error - block\n"); + return 1; + } + + block_len = byte_len & ~(SECTOR_SIZE - 1); + buf += block_len; + byte_len -= block_len; + sector += block_len / SECTOR_SIZE; + + if (byte_len != 0) { + /* read rest of data which are not in whole sector */ + if (zfs_block_dev_desc-> + block_read(zfs_block_dev_desc->dev, + part_info.start + sector, 1, + (unsigned long *) sec_buf) != 1) { + printf(" ** zfs_devread() read error - last part\n"); + return 1; + } + memcpy(buf, sec_buf, byte_len); + } + return 0; +} diff --git a/fs/zfs/zfs.c b/fs/zfs/zfs.c new file mode 100644 index 0000000..d6e0e23 --- /dev/null +++ b/fs/zfs/zfs.c @@ -0,0 +1,2396 @@ +/* + * + * ZFS filesystem ported to u-boot by + * Jorgen 
Lundman <lundman at lundman.net> + * + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 + * Free Software Foundation, Inc. + * Copyright 2004 Sun Microsystems, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + * + */ + +#include <common.h> +#include <malloc.h> +#include <linux/stat.h> +#include <linux/time.h> +#include <linux/ctype.h> +#include <asm/byteorder.h> +#include "zfs_common.h" + +block_dev_desc_t *zfs_dev_desc; + +/* + * The zfs plug-in routines for GRUB are: + * + * zfs_mount() - locates a valid uberblock of the root pool and reads + * in its MOS at the memory address MOS. + * + * zfs_open() - locates a plain file object by following the MOS + * and places its dnode at the memory address DNODE. + * + * zfs_read() - read in the data blocks pointed by the DNODE. + * + */ + +#include <zfs/zfs.h> +#include <zfs/zio.h> +#include <zfs/dnode.h> +#include <zfs/uberblock_impl.h> +#include <zfs/vdev_impl.h> +#include <zfs/zio_checksum.h> +#include <zfs/zap_impl.h> +#include <zfs/zap_leaf.h> +#include <zfs/zfs_znode.h> +#include <zfs/dmu.h> +#include <zfs/dmu_objset.h> +#include <zfs/sa_impl.h> +#include <zfs/dsl_dir.h> +#include <zfs/dsl_dataset.h> + + +#define ZPOOL_PROP_BOOTFS "bootfs" + + +/* + * For nvlist manipulation. 
(from nvpair.h) + */ +#define NV_ENCODE_NATIVE 0 +#define NV_ENCODE_XDR 1 +#define NV_BIG_ENDIAN 0 +#define NV_LITTLE_ENDIAN 1 +#define DATA_TYPE_UINT64 8 +#define DATA_TYPE_STRING 9 +#define DATA_TYPE_NVLIST 19 +#define DATA_TYPE_NVLIST_ARRAY 20 + + +/* + * Macros to get fields in a bp or DVA. + */ +#define P2PHASE(x, align) ((x) & ((align) - 1)) +#define DVA_OFFSET_TO_PHYS_SECTOR(offset) \ + ((offset + VDEV_LABEL_START_SIZE) >> SPA_MINBLOCKSHIFT) + +/* + * return x rounded down to an align boundary + * eg, P2ALIGN(1200, 1024) == 1024 (1*align) + * eg, P2ALIGN(1024, 1024) == 1024 (1*align) + * eg, P2ALIGN(0x1234, 0x100) == 0x1200 (0x12*align) + * eg, P2ALIGN(0x5600, 0x100) == 0x5600 (0x56*align) + */ +#define P2ALIGN(x, align) ((x) & -(align)) + +/* + * FAT ZAP data structures + */ +#define ZFS_CRC64_POLY 0xC96C5795D7870F42ULL /* ECMA-182, reflected form */ +#define ZAP_HASH_IDX(hash, n) (((n) == 0) ? 0 : ((hash) >> (64 - (n)))) +#define CHAIN_END 0xffff /* end of the chunk chain */ + +/* + * The amount of space within the chunk available for the array is: + * chunk size - space for type (1) - space for next pointer (2) + */ +#define ZAP_LEAF_ARRAY_BYTES (ZAP_LEAF_CHUNKSIZE - 3) + +#define ZAP_LEAF_HASH_SHIFT(bs) (bs - 5) +#define ZAP_LEAF_HASH_NUMENTRIES(bs) (1 << ZAP_LEAF_HASH_SHIFT(bs)) +#define LEAF_HASH(bs, h) \ + ((ZAP_LEAF_HASH_NUMENTRIES(bs)-1) & \ + ((h) >> (64 - ZAP_LEAF_HASH_SHIFT(bs)-l->l_hdr.lh_prefix_len))) + +/* + * The amount of space available for chunks is: + * block size shift - hash entry size (2) * number of hash + * entries - header space (2*chunksize) + */ +#define ZAP_LEAF_NUMCHUNKS(bs) \ + (((1<<bs) - 2*ZAP_LEAF_HASH_NUMENTRIES(bs)) / \ + ZAP_LEAF_CHUNKSIZE - 2) + +/* + * The chunks start immediately after the hash table. The end of the + * hash table is at l_hash + HASH_NUMENTRIES, which we simply cast to a + * chunk_t. 
+ */ +#define ZAP_LEAF_CHUNK(l, bs, idx) \ + ((zap_leaf_chunk_t *)(l->l_hash + ZAP_LEAF_HASH_NUMENTRIES(bs)))[idx] +#define ZAP_LEAF_ENTRY(l, bs, idx) (&ZAP_LEAF_CHUNK(l, bs, idx).l_entry) + + +/* + * Decompression Entry - lzjb + */ +#ifndef NBBY +#define NBBY 8 +#endif + + + +typedef int zfs_decomp_func_t(void *s_start, void *d_start, + uint32_t s_len, uint32_t d_len); +typedef struct decomp_entry { + char *name; + zfs_decomp_func_t *decomp_func; +} decomp_entry_t; + +typedef struct dnode_end { + dnode_phys_t dn; + grub_zfs_endian_t endian; +} dnode_end_t; + +struct grub_zfs_data { + /* cache for a file block of the currently zfs_open()-ed file */ + char *file_buf; + uint64_t file_start; + uint64_t file_end; + + /* XXX: ashift is per vdev, not per pool. We currently only ever touch + * a single vdev, but when/if raid-z or stripes are supported, this + * may need revision. + */ + uint64_t vdev_ashift; + uint64_t label_txg; + uint64_t pool_guid; + + /* cache for a dnode block */ + dnode_phys_t *dnode_buf; + dnode_phys_t *dnode_mdn; + uint64_t dnode_start; + uint64_t dnode_end; + grub_zfs_endian_t dnode_endian; + + uberblock_t current_uberblock; + + dnode_end_t mos; + dnode_end_t mdn; + dnode_end_t dnode; + + uint64_t vdev_phys_sector; + + int (*userhook)(const char *, const struct zfs_dirhook_info *); + struct zfs_dirhook_info *dirinfo; + +}; + + + + +static int +zlib_decompress(void *s, void *d, + uint32_t slen, uint32_t dlen) +{ + if (zlib_decompress(s, d, slen, dlen) < 0) + return ZFS_ERR_BAD_FS; + return ZFS_ERR_NONE; +} + +static decomp_entry_t decomp_table[ZIO_COMPRESS_FUNCTIONS] = { + {"inherit", NULL}, /* ZIO_COMPRESS_INHERIT */ + {"on", lzjb_decompress}, /* ZIO_COMPRESS_ON */ + {"off", NULL}, /* ZIO_COMPRESS_OFF */ + {"lzjb", lzjb_decompress}, /* ZIO_COMPRESS_LZJB */ + {"empty", NULL}, /* ZIO_COMPRESS_EMPTY */ + {"gzip-1", zlib_decompress}, /* ZIO_COMPRESS_GZIP1 */ + {"gzip-2", zlib_decompress}, /* ZIO_COMPRESS_GZIP2 */ + {"gzip-3", zlib_decompress}, /* 
ZIO_COMPRESS_GZIP3 */ + {"gzip-4", zlib_decompress}, /* ZIO_COMPRESS_GZIP4 */ + {"gzip-5", zlib_decompress}, /* ZIO_COMPRESS_GZIP5 */ + {"gzip-6", zlib_decompress}, /* ZIO_COMPRESS_GZIP6 */ + {"gzip-7", zlib_decompress}, /* ZIO_COMPRESS_GZIP7 */ + {"gzip-8", zlib_decompress}, /* ZIO_COMPRESS_GZIP8 */ + {"gzip-9", zlib_decompress}, /* ZIO_COMPRESS_GZIP9 */ +}; + + + +static int zio_read_data(blkptr_t *bp, grub_zfs_endian_t endian, + void *buf, struct grub_zfs_data *data); + +static int +zio_read(blkptr_t *bp, grub_zfs_endian_t endian, void **buf, + size_t *size, struct grub_zfs_data *data); + +/* + * Our own version of log2(). Same thing as highbit()-1. + */ +static int +zfs_log2(uint64_t num) +{ + int i = 0; + + while (num > 1) { + i++; + num = num >> 1; + } + + return i; +} + + +/* Checksum Functions */ +static void +zio_checksum_off(const void *buf __attribute__ ((unused)), + uint64_t size __attribute__ ((unused)), + grub_zfs_endian_t endian __attribute__ ((unused)), + zio_cksum_t *zcp) +{ + ZIO_SET_CHECKSUM(zcp, 0, 0, 0, 0); +} + +/* Checksum Table and Values */ +static zio_checksum_info_t zio_checksum_table[ZIO_CHECKSUM_FUNCTIONS] = { + {NULL, 0, 0, "inherit"}, + {NULL, 0, 0, "on"}, + {zio_checksum_off, 0, 0, "off"}, + {zio_checksum_SHA256, 1, 1, "label"}, + {zio_checksum_SHA256, 1, 1, "gang_header"}, + {NULL, 0, 0, "zilog"}, + {fletcher_2, 0, 0, "fletcher2"}, + {fletcher_4, 1, 0, "fletcher4"}, + {zio_checksum_SHA256, 1, 0, "SHA256"}, + {NULL, 0, 0, "zilog2"}, +}; + +/* + * zio_checksum_verify: Provides support for checksum verification. + * + * Fletcher2, Fletcher4, and SHA256 are supported. 
+ * + */ +static int +zio_checksum_verify(zio_cksum_t zc, uint32_t checksum, + grub_zfs_endian_t endian, char *buf, int size) +{ + zio_eck_t *zec = (zio_eck_t *) (buf + size) - 1; + zio_checksum_info_t *ci = &zio_checksum_table[checksum]; + zio_cksum_t actual_cksum, expected_cksum; + + if (checksum >= ZIO_CHECKSUM_FUNCTIONS || ci->ci_func == NULL) { + printf("zfs unknown checksum function %d\n", checksum); + return ZFS_ERR_NOT_IMPLEMENTED_YET; + } + + if (ci->ci_eck) { + expected_cksum = zec->zec_cksum; + zec->zec_cksum = zc; + ci->ci_func(buf, size, endian, &actual_cksum); + zec->zec_cksum = expected_cksum; + zc = expected_cksum; + } else { + ci->ci_func(buf, size, endian, &actual_cksum); + } + + if ((actual_cksum.zc_word[0] != zc.zc_word[0]) + || (actual_cksum.zc_word[1] != zc.zc_word[1]) + || (actual_cksum.zc_word[2] != zc.zc_word[2]) + || (actual_cksum.zc_word[3] != zc.zc_word[3])) { + return ZFS_ERR_BAD_FS; + } + + return ZFS_ERR_NONE; +} + +/* + * vdev_uberblock_compare takes two uberblock structures and returns an integer + * indicating the more recent of the two. + * Return Value = 1 if ub1 is more recent + * Return Value = -1 if ub2 is more recent + * The most recent uberblock is determined using its transaction number and + * timestamp. The uberblock with the highest transaction number is + * considered "newer". If the transaction numbers of the two blocks match, the + * timestamps are compared to determine the "newer" of the two. 
+ */ +static int +vdev_uberblock_compare(uberblock_t *ub1, uberblock_t *ub2) +{ + grub_zfs_endian_t ub1_endian, ub2_endian; + if (grub_zfs_to_cpu64(ub1->ub_magic, LITTLE_ENDIAN) == UBERBLOCK_MAGIC) + ub1_endian = LITTLE_ENDIAN; + else + ub1_endian = BIG_ENDIAN; + if (grub_zfs_to_cpu64(ub2->ub_magic, LITTLE_ENDIAN) == UBERBLOCK_MAGIC) + ub2_endian = LITTLE_ENDIAN; + else + ub2_endian = BIG_ENDIAN; + + if (grub_zfs_to_cpu64(ub1->ub_txg, ub1_endian) + < grub_zfs_to_cpu64(ub2->ub_txg, ub2_endian)) + return -1; + if (grub_zfs_to_cpu64(ub1->ub_txg, ub1_endian) + > grub_zfs_to_cpu64(ub2->ub_txg, ub2_endian)) + return 1; + + if (grub_zfs_to_cpu64(ub1->ub_timestamp, ub1_endian) + < grub_zfs_to_cpu64(ub2->ub_timestamp, ub2_endian)) + return -1; + if (grub_zfs_to_cpu64(ub1->ub_timestamp, ub1_endian) + > grub_zfs_to_cpu64(ub2->ub_timestamp, ub2_endian)) + return 1; + + return 0; +} + +/* + * Three pieces of information are needed to verify an uberblock: the magic + * number, the version number, and the checksum. 
+ * + * Currently Implemented: version number, magic number, label txg + * Need to Implement: checksum + * + */ +static int +uberblock_verify(uberblock_t *uber, int offset, struct grub_zfs_data *data) +{ + int err; + grub_zfs_endian_t endian = UNKNOWN_ENDIAN; + zio_cksum_t zc; + + if (uber->ub_txg < data->label_txg) { + debug("ignoring partially written label: uber_txg < label_txg %llu %llu\n", + uber->ub_txg, data->label_txg); + return ZFS_ERR_BAD_FS; + } + + if (grub_zfs_to_cpu64(uber->ub_magic, LITTLE_ENDIAN) == UBERBLOCK_MAGIC + && grub_zfs_to_cpu64(uber->ub_version, LITTLE_ENDIAN) > 0 + && grub_zfs_to_cpu64(uber->ub_version, LITTLE_ENDIAN) <= SPA_VERSION) + endian = LITTLE_ENDIAN; + + if (grub_zfs_to_cpu64(uber->ub_magic, BIG_ENDIAN) == UBERBLOCK_MAGIC + && grub_zfs_to_cpu64(uber->ub_version, BIG_ENDIAN) > 0 + && grub_zfs_to_cpu64(uber->ub_version, BIG_ENDIAN) <= SPA_VERSION) + endian = BIG_ENDIAN; + + if (endian == UNKNOWN_ENDIAN) { + printf("invalid uberblock magic\n"); + return ZFS_ERR_BAD_FS; + } + + memset(&zc, 0, sizeof(zc)); + zc.zc_word[0] = grub_cpu_to_zfs64(offset, endian); + err = zio_checksum_verify(zc, ZIO_CHECKSUM_LABEL, endian, + (char *) uber, UBERBLOCK_SIZE(data->vdev_ashift)); + + if (!err) { + /* Check that the data pointed by the rootbp is usable. */ + void *osp = NULL; + size_t ospsize; + err = zio_read(&uber->ub_rootbp, endian, &osp, &ospsize, data); + free(osp); + + if (!err && ospsize < OBJSET_PHYS_SIZE_V14) { + printf("uberblock rootbp points to invalid data\n"); + return ZFS_ERR_BAD_FS; + } + } + + return err; +} + +/* + * Find the best uberblock. + * Return: + * Success - Pointer to the best uberblock. 
+ * Failure - NULL + */ +static uberblock_t *find_bestub(char *ub_array, struct grub_zfs_data *data) +{ + const uint64_t sector = data->vdev_phys_sector; + uberblock_t *ubbest = NULL; + uberblock_t *ubnext; + unsigned int i, offset, pickedub = 0; + int err = ZFS_ERR_NONE; + + const unsigned int UBCOUNT = UBERBLOCK_COUNT(data->vdev_ashift); + const uint64_t UBBYTES = UBERBLOCK_SIZE(data->vdev_ashift); + + for (i = 0; i < UBCOUNT; i++) { + ubnext = (uberblock_t *) (i * UBBYTES + ub_array); + offset = (sector << SPA_MINBLOCKSHIFT) + VDEV_PHYS_SIZE + (i * UBBYTES); + + err = uberblock_verify(ubnext, offset, data); + if (err) + continue; + + if (ubbest == NULL || vdev_uberblock_compare(ubnext, ubbest) > 0) { + ubbest = ubnext; + pickedub = i; + } + } + + if (ubbest) + debug("zfs Found best uberblock at idx %d, txg %llu\n", + pickedub, (unsigned long long) ubbest->ub_txg); + + return ubbest; +} + +static inline size_t +get_psize(blkptr_t *bp, grub_zfs_endian_t endian) +{ + return (((grub_zfs_to_cpu64((bp)->blk_prop, endian) >> 16) & 0xffff) + 1) + << SPA_MINBLOCKSHIFT; +} + +static uint64_t +dva_get_offset(dva_t *dva, grub_zfs_endian_t endian) +{ + return grub_zfs_to_cpu64((dva)->dva_word[1], + endian) << SPA_MINBLOCKSHIFT; +} + +/* + * Read a block of data based on the gang block address dva, + * and put its data in buf. 
+ * + */ +static int +zio_read_gang(blkptr_t *bp, grub_zfs_endian_t endian, dva_t *dva, void *buf, + struct grub_zfs_data *data) +{ + zio_gbh_phys_t *zio_gb; + uint64_t offset, sector; + unsigned i; + int err; + zio_cksum_t zc; + + memset(&zc, 0, sizeof(zc)); + + zio_gb = malloc(SPA_GANGBLOCKSIZE); + if (!zio_gb) + return ZFS_ERR_OUT_OF_MEMORY; + + offset = dva_get_offset(dva, endian); + sector = DVA_OFFSET_TO_PHYS_SECTOR(offset); + + /* read in the gang block header */ + err = zfs_devread(sector, 0, SPA_GANGBLOCKSIZE, (char *) zio_gb); + + if (err) { + free(zio_gb); + return err; + } + + /* XXX */ + /* self checksumming the gang block header */ + ZIO_SET_CHECKSUM(&zc, DVA_GET_VDEV(dva), + dva_get_offset(dva, endian), bp->blk_birth, 0); + err = zio_checksum_verify(zc, ZIO_CHECKSUM_GANG_HEADER, endian, + (char *) zio_gb, SPA_GANGBLOCKSIZE); + if (err) { + free(zio_gb); + return err; + } + + endian = (grub_zfs_to_cpu64(bp->blk_prop, endian) >> 63) & 1; + + for (i = 0; i < SPA_GBH_NBLKPTRS; i++) { + if (zio_gb->zg_blkptr[i].blk_birth == 0) + continue; + + err = zio_read_data(&zio_gb->zg_blkptr[i], endian, buf, data); + if (err) { + free(zio_gb); + return err; + } + buf = (char *) buf + get_psize(&zio_gb->zg_blkptr[i], endian); + } + free(zio_gb); + return ZFS_ERR_NONE; +} + +/* + * Read in a block of raw data to buf. 
+ */ +static int +zio_read_data(blkptr_t *bp, grub_zfs_endian_t endian, void *buf, + struct grub_zfs_data *data) +{ + int i, psize; + int err = ZFS_ERR_NONE; + + psize = get_psize(bp, endian); + + /* pick a good dva from the block pointer */ + for (i = 0; i < SPA_DVAS_PER_BP; i++) { + uint64_t offset, sector; + + if (bp->blk_dva[i].dva_word[0] == 0 && bp->blk_dva[i].dva_word[1] == 0) + continue; + + if ((grub_zfs_to_cpu64(bp->blk_dva[i].dva_word[1], endian)>>63) & 1) { + err = zio_read_gang(bp, endian, &bp->blk_dva[i], buf, data); + } else { + /* read in a data block */ + offset = dva_get_offset(&bp->blk_dva[i], endian); + sector = DVA_OFFSET_TO_PHYS_SECTOR(offset); + + err = zfs_devread(sector, 0, psize, buf); + } + + if (!err) { + /*Check the underlying checksum before we rule this DVA as "good"*/ + uint32_t checkalgo = (grub_zfs_to_cpu64((bp)->blk_prop, endian) >> 40) & 0xff; + + err = zio_checksum_verify(bp->blk_cksum, checkalgo, endian, buf, psize); + if (!err) + return ZFS_ERR_NONE; + } + + /* If read failed or checksum bad, reset the error. Hopefully we've got some more DVA's to try.*/ + } + + if (!err) { + printf("couldn't find a valid DVA\n"); + err = ZFS_ERR_BAD_FS; + } + + return err; +} + +/* + * Read in a block of data, verify its checksum, decompress if needed, + * and put the uncompressed data in buf. + */ +static int +zio_read(blkptr_t *bp, grub_zfs_endian_t endian, void **buf, + size_t *size, struct grub_zfs_data *data) +{ + size_t lsize, psize; + unsigned int comp; + char *compbuf = NULL; + int err; + + *buf = NULL; + + comp = (grub_zfs_to_cpu64((bp)->blk_prop, endian)>>32) & 0xff; + lsize = (BP_IS_HOLE(bp) ? 
0 : + (((grub_zfs_to_cpu64((bp)->blk_prop, endian) & 0xffff) + 1) + << SPA_MINBLOCKSHIFT)); + psize = get_psize(bp, endian); + + if (size) + *size = lsize; + + if (comp >= ZIO_COMPRESS_FUNCTIONS) { + printf("compression algorithm %u not supported\n", (unsigned int) comp); + return ZFS_ERR_NOT_IMPLEMENTED_YET; + } + + if (comp != ZIO_COMPRESS_OFF && decomp_table[comp].decomp_func == NULL) { + printf("compression algorithm %s not supported\n", decomp_table[comp].name); + return ZFS_ERR_NOT_IMPLEMENTED_YET; + } + + if (comp != ZIO_COMPRESS_OFF) { + compbuf = malloc(psize); + if (!compbuf) + return ZFS_ERR_OUT_OF_MEMORY; + } else { + compbuf = *buf = malloc(lsize); + } + + err = zio_read_data(bp, endian, compbuf, data); + if (err) { + free(compbuf); + *buf = NULL; + return err; + } + + if (comp != ZIO_COMPRESS_OFF) { + *buf = malloc(lsize); + if (!*buf) { + free(compbuf); + return ZFS_ERR_OUT_OF_MEMORY; + } + + err = decomp_table[comp].decomp_func(compbuf, *buf, psize, lsize); + free(compbuf); + if (err) { + free(*buf); + *buf = NULL; + return err; + } + } + + return ZFS_ERR_NONE; +} + +/* + * Get the block from a block id. + * push the block onto the stack. 
+ * + */ +static int +dmu_read(dnode_end_t *dn, uint64_t blkid, void **buf, + grub_zfs_endian_t *endian_out, struct grub_zfs_data *data) +{ + int idx, level; + blkptr_t *bp_array = dn->dn.dn_blkptr; + int epbs = dn->dn.dn_indblkshift - SPA_BLKPTRSHIFT; + blkptr_t *bp; + void *tmpbuf = 0; + grub_zfs_endian_t endian; + int err = ZFS_ERR_NONE; + + bp = malloc(sizeof(blkptr_t)); + if (!bp) + return ZFS_ERR_OUT_OF_MEMORY; + + endian = dn->endian; + for (level = dn->dn.dn_nlevels - 1; level >= 0; level--) { + idx = (blkid >> (epbs * level)) & ((1 << epbs) - 1); + *bp = bp_array[idx]; + if (bp_array != dn->dn.dn_blkptr) { + free(bp_array); + bp_array = 0; + } + + if (BP_IS_HOLE(bp)) { + size_t size = grub_zfs_to_cpu16(dn->dn.dn_datablkszsec, + dn->endian) + << SPA_MINBLOCKSHIFT; + *buf = malloc(size); + if (!*buf) { + err = ZFS_ERR_OUT_OF_MEMORY; + break; + } + memset(*buf, 0, size); + endian = (grub_zfs_to_cpu64(bp->blk_prop, endian) >> 63) & 1; + break; + } + if (level == 0) { + err = zio_read(bp, endian, buf, 0, data); + endian = (grub_zfs_to_cpu64(bp->blk_prop, endian) >> 63) & 1; + break; + } + err = zio_read(bp, endian, &tmpbuf, 0, data); + endian = (grub_zfs_to_cpu64(bp->blk_prop, endian) >> 63) & 1; + if (err) + break; + bp_array = tmpbuf; + } + if (bp_array != dn->dn.dn_blkptr) + free(bp_array); + if (endian_out) + *endian_out = endian; + + free(bp); + return err; +} + +/* + * mzap_lookup: Looks up property described by "name" and returns the value + * in "value". 
+ */ +static int +mzap_lookup(mzap_phys_t *zapobj, grub_zfs_endian_t endian, + int objsize, char *name, uint64_t * value) +{ + int i, chunks; + mzap_ent_phys_t *mzap_ent = zapobj->mz_chunk; + + chunks = objsize / MZAP_ENT_LEN - 1; + for (i = 0; i < chunks; i++) { + if (strcmp(mzap_ent[i].mze_name, name) == 0) { + *value = grub_zfs_to_cpu64(mzap_ent[i].mze_value, endian); + return ZFS_ERR_NONE; + } + } + + printf("couldn't find '%s'\n", name); + return ZFS_ERR_FILE_NOT_FOUND; +} + +static int +mzap_iterate(mzap_phys_t *zapobj, grub_zfs_endian_t endian, int objsize, + int (*hook)(const char *name, + uint64_t val, + struct grub_zfs_data *data), + struct grub_zfs_data *data) +{ + int i, chunks; + mzap_ent_phys_t *mzap_ent = zapobj->mz_chunk; + + chunks = objsize / MZAP_ENT_LEN - 1; + for (i = 0; i < chunks; i++) { + if (hook(mzap_ent[i].mze_name, + grub_zfs_to_cpu64(mzap_ent[i].mze_value, endian), + data)) + return 1; + } + + return 0; +} + +static uint64_t +zap_hash(uint64_t salt, const char *name) +{ + static uint64_t table[256]; + const uint8_t *cp; + uint8_t c; + uint64_t crc = salt; + + if (table[128] == 0) { + uint64_t *ct; + int i, j; + for (i = 0; i < 256; i++) { + for (ct = table + i, *ct = i, j = 8; j > 0; j--) + *ct = (*ct >> 1) ^ (-(*ct & 1) & ZFS_CRC64_POLY); + } + } + + for (cp = (const uint8_t *) name; (c = *cp) != '\0'; cp++) + crc = (crc >> 8) ^ table[(crc ^ c) & 0xFF]; + + /* + * Only use 28 bits, since we need 4 bits in the cookie for the + * collision differentiator. We MUST use the high bits, since + * those are the ones that we first pay attention to when + * choosing the bucket. + */ + crc &= ~((1ULL << (64 - ZAP_HASHBITS)) - 1); + + return crc; +} + +/* + * Only to be used on 8-bit arrays. + * array_len is actual len in bytes (not encoded le_value_length). + * buf is null-terminated. 
+ */ +/* XXX */ +static int +zap_leaf_array_equal(zap_leaf_phys_t *l, grub_zfs_endian_t endian, + int blksft, int chunk, int array_len, const char *buf) +{ + int bseen = 0; + + while (bseen < array_len) { + struct zap_leaf_array *la = &ZAP_LEAF_CHUNK(l, blksft, chunk).l_array; + int toread = MIN(array_len - bseen, ZAP_LEAF_ARRAY_BYTES); + + if (chunk >= ZAP_LEAF_NUMCHUNKS(blksft)) + return 0; + + if (memcmp(la->la_array, buf + bseen, toread) != 0) + break; + chunk = grub_zfs_to_cpu16(la->la_next, endian); + bseen += toread; + } + return (bseen == array_len); +} + +/* XXX */ +static int +zap_leaf_array_get(zap_leaf_phys_t *l, grub_zfs_endian_t endian, int blksft, + int chunk, int array_len, char *buf) +{ + int bseen = 0; + + while (bseen < array_len) { + struct zap_leaf_array *la = &ZAP_LEAF_CHUNK(l, blksft, chunk).l_array; + int toread = MIN(array_len - bseen, ZAP_LEAF_ARRAY_BYTES); + + if (chunk >= ZAP_LEAF_NUMCHUNKS(blksft)) + /* Don't use errno because this error is to be ignored. */ + return ZFS_ERR_BAD_FS; + + memcpy(buf + bseen, la->la_array, toread); + chunk = grub_zfs_to_cpu16(la->la_next, endian); + bseen += toread; + } + return ZFS_ERR_NONE; +} + + +/* + * Given a zap_leaf_phys_t, walk thru the zap leaf chunks to get the + * value for the property "name". 
+ * + */ +/* XXX */ +static int +zap_leaf_lookup(zap_leaf_phys_t *l, grub_zfs_endian_t endian, + int blksft, uint64_t h, + const char *name, uint64_t *value) +{ + uint16_t chunk; + struct zap_leaf_entry *le; + + /* Verify if this is a valid leaf block */ + if (grub_zfs_to_cpu64(l->l_hdr.lh_block_type, endian) != ZBT_LEAF) { + printf("invalid leaf type\n"); + return ZFS_ERR_BAD_FS; + } + if (grub_zfs_to_cpu32(l->l_hdr.lh_magic, endian) != ZAP_LEAF_MAGIC) { + printf("invalid leaf magic\n"); + return ZFS_ERR_BAD_FS; + } + + for (chunk = grub_zfs_to_cpu16(l->l_hash[LEAF_HASH(blksft, h)], endian); + chunk != CHAIN_END; chunk = le->le_next) { + + if (chunk >= ZAP_LEAF_NUMCHUNKS(blksft)) { + printf("invalid chunk number\n"); + return ZFS_ERR_BAD_FS; + } + + le = ZAP_LEAF_ENTRY(l, blksft, chunk); + + /* Verify the chunk entry */ + if (le->le_type != ZAP_CHUNK_ENTRY) { + printf("invalid chunk entry\n"); + return ZFS_ERR_BAD_FS; + } + + if (grub_zfs_to_cpu64(le->le_hash, endian) != h) + continue; + + if (zap_leaf_array_equal(l, endian, blksft, + grub_zfs_to_cpu16(le->le_name_chunk, endian), + grub_zfs_to_cpu16(le->le_name_length, endian), + name)) { + struct zap_leaf_array *la; + + if (le->le_int_size != 8 || le->le_value_length != 1) { + printf("invalid leaf chunk entry\n"); + return ZFS_ERR_BAD_FS; + } + /* get the uint64_t property value */ + la = &ZAP_LEAF_CHUNK(l, blksft, le->le_value_chunk).l_array; + + *value = grub_be_to_cpu64(la->la_array64); + + return ZFS_ERR_NONE; + } + } + + printf("couldn't find '%s'\n", name); + return ZFS_ERR_FILE_NOT_FOUND; +} + + +/* Verify if this is a fat zap header block */ +static int +zap_verify(zap_phys_t *zap) +{ + if (zap->zap_magic != (uint64_t) ZAP_MAGIC) { + printf("bad ZAP magic\n"); + return ZFS_ERR_BAD_FS; + } + + if (zap->zap_flags != 0) { + printf("bad ZAP flags\n"); + return ZFS_ERR_BAD_FS; + } + + if (zap->zap_salt == 0) { + printf("bad ZAP salt\n"); + return ZFS_ERR_BAD_FS; + } + + return ZFS_ERR_NONE; +} + +/* + * Fat 
ZAP lookup + * + */ +/* XXX */ +static int +fzap_lookup(dnode_end_t *zap_dnode, zap_phys_t *zap, + char *name, uint64_t *value, struct grub_zfs_data *data) +{ + void *l; + uint64_t hash, idx, blkid; + int blksft = zfs_log2(grub_zfs_to_cpu16(zap_dnode->dn.dn_datablkszsec, + zap_dnode->endian) << DNODE_SHIFT); + int err; + grub_zfs_endian_t leafendian; + + err = zap_verify(zap); + if (err) + return err; + + hash = zap_hash(zap->zap_salt, name); + + /* get block id from index */ + if (zap->zap_ptrtbl.zt_numblks != 0) { + printf("external pointer tables not supported\n"); + return ZFS_ERR_NOT_IMPLEMENTED_YET; + } + idx = ZAP_HASH_IDX(hash, zap->zap_ptrtbl.zt_shift); + blkid = ((uint64_t *) zap)[idx + (1 << (blksft - 3 - 1))]; + + /* Get the leaf block */ + if ((1U << blksft) < sizeof(zap_leaf_phys_t)) { + printf("ZAP leaf is too small\n"); + return ZFS_ERR_BAD_FS; + } + err = dmu_read(zap_dnode, blkid, &l, &leafendian, data); + if (err) + return err; + + err = zap_leaf_lookup(l, leafendian, blksft, hash, name, value); + free(l); + return err; +} + +/* XXX */ +static int +fzap_iterate(dnode_end_t *zap_dnode, zap_phys_t *zap, + int (*hook)(const char *name, + uint64_t val, + struct grub_zfs_data *data), + struct grub_zfs_data *data) +{ + zap_leaf_phys_t *l; + void *l_in; + uint64_t idx, blkid; + uint16_t chunk; + int blksft = zfs_log2(grub_zfs_to_cpu16(zap_dnode->dn.dn_datablkszsec, + zap_dnode->endian) << DNODE_SHIFT); + int err; + grub_zfs_endian_t endian; + + if (zap_verify(zap)) + return 0; + + /* get block id from index */ + if (zap->zap_ptrtbl.zt_numblks != 0) { + printf("external pointer tables not supported\n"); + return 0; + } + /* Get the leaf block */ + if ((1U << blksft) < sizeof(zap_leaf_phys_t)) { + printf("ZAP leaf is too small\n"); + return 0; + } + for (idx = 0; idx < zap->zap_ptrtbl.zt_numblks; idx++) { + blkid = ((uint64_t *) zap)[idx + (1 << (blksft - 3 - 1))]; + + err = dmu_read(zap_dnode, blkid, &l_in, &endian, data); + l = l_in; + if (err) + 
continue; + + /* Verify if this is a valid leaf block */ + if (grub_zfs_to_cpu64(l->l_hdr.lh_block_type, endian) != ZBT_LEAF) { + free(l); + continue; + } + if (grub_zfs_to_cpu32(l->l_hdr.lh_magic, endian) != ZAP_LEAF_MAGIC) { + free(l); + continue; + } + + for (chunk = 0; chunk < ZAP_LEAF_NUMCHUNKS(blksft); chunk++) { + char *buf; + struct zap_leaf_array *la; + struct zap_leaf_entry *le; + uint64_t val; + le = ZAP_LEAF_ENTRY(l, blksft, chunk); + + /* Verify the chunk entry */ + if (le->le_type != ZAP_CHUNK_ENTRY) + continue; + + buf = malloc(grub_zfs_to_cpu16(le->le_name_length, endian) + + 1); + if (zap_leaf_array_get(l, endian, blksft, le->le_name_chunk, + le->le_name_length, buf)) { + free(buf); + continue; + } + buf[le->le_name_length] = 0; + + if (le->le_int_size != 8 + || grub_zfs_to_cpu16(le->le_value_length, endian) != 1) + continue; + + /* get the uint64_t property value */ + la = &ZAP_LEAF_CHUNK(l, blksft, le->le_value_chunk).l_array; + val = grub_be_to_cpu64(la->la_array64); + if (hook(buf, val, data)) + return 1; + free(buf); + } + } + return 0; +} + + +/* + * Read in the data of a zap object and find the value for a matching + * property name. + * + */ +static int +zap_lookup(dnode_end_t *zap_dnode, char *name, uint64_t *val, + struct grub_zfs_data *data) +{ + uint64_t block_type; + int size; + void *zapbuf; + int err; + grub_zfs_endian_t endian; + + /* Read in the first block of the zap object data. 
*/ + size = grub_zfs_to_cpu16(zap_dnode->dn.dn_datablkszsec, + zap_dnode->endian) << SPA_MINBLOCKSHIFT; + err = dmu_read(zap_dnode, 0, &zapbuf, &endian, data); + if (err) + return err; + block_type = grub_zfs_to_cpu64(*((uint64_t *) zapbuf), endian); + + if (block_type == ZBT_MICRO) { + err = (mzap_lookup(zapbuf, endian, size, name, val)); + free(zapbuf); + return err; + } else if (block_type == ZBT_HEADER) { + /* this is a fat zap */ + err = (fzap_lookup(zap_dnode, zapbuf, name, val, data)); + free(zapbuf); + return err; + } + + printf("unknown ZAP type\n"); + return ZFS_ERR_BAD_FS; +} + +static int +zap_iterate(dnode_end_t *zap_dnode, + int (*hook)(const char *name, uint64_t val, + struct grub_zfs_data *data), + struct grub_zfs_data *data) +{ + uint64_t block_type; + int size; + void *zapbuf; + int err; + int ret; + grub_zfs_endian_t endian; + + /* Read in the first block of the zap object data. */ + size = grub_zfs_to_cpu16(zap_dnode->dn.dn_datablkszsec, zap_dnode->endian) << SPA_MINBLOCKSHIFT; + err = dmu_read(zap_dnode, 0, &zapbuf, &endian, data); + if (err) + return 0; + block_type = grub_zfs_to_cpu64(*((uint64_t *) zapbuf), endian); + + if (block_type == ZBT_MICRO) { + ret = mzap_iterate(zapbuf, endian, size, hook, data); + free(zapbuf); + return ret; + } else if (block_type == ZBT_HEADER) { + /* this is a fat zap */ + ret = fzap_iterate(zap_dnode, zapbuf, hook, data); + free(zapbuf); + return ret; + } + printf("unknown ZAP type\n"); + return 0; +} + + +/* + * Get the dnode of an object number from the metadnode of an object set. 
+ * + * Input + * mdn - metadnode to get the object dnode + * objnum - object number for the object dnode + * buf - data buffer that holds the returning dnode + */ +static int +dnode_get(dnode_end_t *mdn, uint64_t objnum, uint8_t type, + dnode_end_t *buf, struct grub_zfs_data *data) +{ + uint64_t blkid, blksz; /* the block id this object dnode is in */ + int epbs; /* shift of number of dnodes in a block */ + int idx; /* index within a block */ + void *dnbuf; + int err; + grub_zfs_endian_t endian; + + blksz = grub_zfs_to_cpu16(mdn->dn.dn_datablkszsec, + mdn->endian) << SPA_MINBLOCKSHIFT; + + epbs = zfs_log2(blksz) - DNODE_SHIFT; + blkid = objnum >> epbs; + idx = objnum & ((1 << epbs) - 1); + + if (data->dnode_buf != NULL && memcmp(data->dnode_mdn, mdn, + sizeof(*mdn)) == 0 + && objnum >= data->dnode_start && objnum < data->dnode_end) { + memmove(&(buf->dn), &(data->dnode_buf)[idx], DNODE_SIZE); + buf->endian = data->dnode_endian; + if (type && buf->dn.dn_type != type) { + printf("incorrect dnode type: %02X != %02x\n", buf->dn.dn_type, type); + return ZFS_ERR_BAD_FS; + } + return ZFS_ERR_NONE; + } + + err = dmu_read(mdn, blkid, &dnbuf, &endian, data); + if (err) + return err; + + free(data->dnode_buf); + free(data->dnode_mdn); + data->dnode_mdn = malloc(sizeof(*mdn)); + if (!data->dnode_mdn) { + data->dnode_buf = 0; + } else { + memcpy(data->dnode_mdn, mdn, sizeof(*mdn)); + data->dnode_buf = dnbuf; + data->dnode_start = blkid << epbs; + data->dnode_end = (blkid + 1) << epbs; + data->dnode_endian = endian; + } + + memmove(&(buf->dn), (dnode_phys_t *) dnbuf + idx, DNODE_SIZE); + buf->endian = endian; + if (type && buf->dn.dn_type != type) { + printf("incorrect dnode type\n"); + return ZFS_ERR_BAD_FS; + } + + return ZFS_ERR_NONE; +} + +/* + * Get the file dnode for a given file name where mdn is the meta dnode + * for this ZFS object set. When found, place the file dnode in dn. + * The 'path' argument will be mangled. 
+ * + */ +static int +dnode_get_path(dnode_end_t *mdn, const char *path_in, dnode_end_t *dn, + struct grub_zfs_data *data) +{ + uint64_t objnum, version; + char *cname, ch; + int err = ZFS_ERR_NONE; + char *path, *path_buf; + struct dnode_chain { + struct dnode_chain *next; + dnode_end_t dn; + }; + struct dnode_chain *dnode_path = 0, *dn_new, *root; + + dn_new = malloc(sizeof(*dn_new)); + if (!dn_new) + return ZFS_ERR_OUT_OF_MEMORY; + dn_new->next = 0; + dnode_path = root = dn_new; + + err = dnode_get(mdn, MASTER_NODE_OBJ, DMU_OT_MASTER_NODE, + &(dnode_path->dn), data); + if (err) { + free(dn_new); + return err; + } + + err = zap_lookup(&(dnode_path->dn), ZPL_VERSION_STR, &version, data); + if (err) { + free(dn_new); + return err; + } + if (version > ZPL_VERSION) { + free(dn_new); + printf("too new ZPL version\n"); + return ZFS_ERR_NOT_IMPLEMENTED_YET; + } + + err = zap_lookup(&(dnode_path->dn), ZFS_ROOT_OBJ, &objnum, data); + if (err) { + free(dn_new); + return err; + } + + err = dnode_get(mdn, objnum, 0, &(dnode_path->dn), data); + if (err) { + free(dn_new); + return err; + } + + path = path_buf = strdup(path_in); + if (!path_buf) { + free(dn_new); + return ZFS_ERR_OUT_OF_MEMORY; + } + + while (1) { + /* skip leading slashes */ + while (*path == '/') + path++; + if (!*path) + break; + /* get the next component name */ + cname = path; + while (*path && *path != '/') + path++; + /* Skip dot. */ + if (cname + 1 == path && cname[0] == '.') + continue; + /* Handle double dot. */ + if (cname + 2 == path && cname[0] == '.' 
&& cname[1] == '.') { + if (dnode_path->next) { + dn_new = dnode_path; + dnode_path = dn_new->next; + free(dn_new); + } else { + printf("can't resolve ..\n"); + err = ZFS_ERR_FILE_NOT_FOUND; + break; + } + continue; + } + + ch = *path; + *path = 0; /* ensure null termination */ + + if (dnode_path->dn.dn.dn_type != DMU_OT_DIRECTORY_CONTENTS) { + printf("not a directory\n"); + err = ZFS_ERR_BAD_FILE_TYPE; + break; + } + err = zap_lookup(&(dnode_path->dn), cname, &objnum, data); + if (err) + break; + + dn_new = malloc(sizeof(*dn_new)); + if (!dn_new) { + err = ZFS_ERR_OUT_OF_MEMORY; + break; + } + dn_new->next = dnode_path; + dnode_path = dn_new; + + objnum = ZFS_DIRENT_OBJ(objnum); + err = dnode_get(mdn, objnum, 0, &(dnode_path->dn), data); + if (err) + break; + + *path = ch; + } + + if (!err) + memcpy(dn, &(dnode_path->dn), sizeof(*dn)); + + while (dnode_path) { + dn_new = dnode_path->next; + free(dnode_path); + dnode_path = dn_new; + } + free(path_buf); + return err; +} + + +/* + * Given a MOS metadnode, get the metadnode of a given filesystem name (fsname), + * e.g. pool/rootfs, or a given object number (obj), e.g. the object number + * of pool/rootfs. + * + * If no fsname and no obj are given, return the DSL_DIR metadnode. + * If fsname is given, return its metadnode and its matching object number. + * If only obj is given, return the metadnode for this object number. 
+ * + */ +static int +get_filesystem_dnode(dnode_end_t *mosmdn, char *fsname, + dnode_end_t *mdn, struct grub_zfs_data *data) +{ + uint64_t objnum; + int err; + + err = dnode_get(mosmdn, DMU_POOL_DIRECTORY_OBJECT, + DMU_OT_OBJECT_DIRECTORY, mdn, data); + if (err) + return err; + + err = zap_lookup(mdn, DMU_POOL_ROOT_DATASET, &objnum, data); + if (err) + return err; + + err = dnode_get(mosmdn, objnum, DMU_OT_DSL_DIR, mdn, data); + if (err) + return err; + + while (*fsname) { + uint64_t childobj; + char *cname, ch; + + while (*fsname == '/') + fsname++; + + if (!*fsname || *fsname == '@') + break; + + cname = fsname; + while (*fsname && !isspace(*fsname) && *fsname != '/') + fsname++; + ch = *fsname; + *fsname = 0; + + childobj = grub_zfs_to_cpu64((((dsl_dir_phys_t *) DN_BONUS(&mdn->dn)))->dd_child_dir_zapobj, mdn->endian); + err = dnode_get(mosmdn, childobj, + DMU_OT_DSL_DIR_CHILD_MAP, mdn, data); + if (err) + return err; + + err = zap_lookup(mdn, cname, &objnum, data); + if (err) + return err; + + err = dnode_get(mosmdn, objnum, DMU_OT_DSL_DIR, mdn, data); + if (err) + return err; + + *fsname = ch; + } + return ZFS_ERR_NONE; +} + +static int +make_mdn(dnode_end_t *mdn, struct grub_zfs_data *data) +{ + void *osp; + blkptr_t *bp; + size_t ospsize; + int err; + + bp = &(((dsl_dataset_phys_t *) DN_BONUS(&mdn->dn))->ds_bp); + err = zio_read(bp, mdn->endian, &osp, &ospsize, data); + if (err) + return err; + if (ospsize < OBJSET_PHYS_SIZE_V14) { + free(osp); + printf("too small osp\n"); + return ZFS_ERR_BAD_FS; + } + + mdn->endian = (grub_zfs_to_cpu64(bp->blk_prop, mdn->endian)>>63) & 1; + memmove((char *) &(mdn->dn), + (char *) &((objset_phys_t *) osp)->os_meta_dnode, DNODE_SIZE); + free(osp); + return ZFS_ERR_NONE; +} + +static int +dnode_get_fullpath(const char *fullpath, dnode_end_t *mdn, + uint64_t *mdnobj, dnode_end_t *dn, int *isfs, + struct grub_zfs_data *data) +{ + char *fsname, *snapname; + const char *ptr_at, *filename; + uint64_t headobj; + int err; + + ptr_at 
= strchr(fullpath, '@'); + if (!ptr_at) { + *isfs = 1; + filename = 0; + snapname = 0; + fsname = strdup(fullpath); + } else { + const char *ptr_slash = strchr(ptr_at, '/'); + + *isfs = 0; + fsname = malloc(ptr_at - fullpath + 1); + if (!fsname) + return ZFS_ERR_OUT_OF_MEMORY; + memcpy(fsname, fullpath, ptr_at - fullpath); + fsname[ptr_at - fullpath] = 0; + if (ptr_at[1] && ptr_at[1] != '/') { + snapname = malloc(ptr_slash - ptr_at); + if (!snapname) { + free(fsname); + return ZFS_ERR_OUT_OF_MEMORY; + } + memcpy(snapname, ptr_at + 1, ptr_slash - ptr_at - 1); + snapname[ptr_slash - ptr_at - 1] = 0; + } else { + snapname = 0; + } + if (ptr_slash) + filename = ptr_slash; + else + filename = "/"; + printf("zfs fsname = '%s' snapname='%s' filename = '%s'\n", + fsname, snapname, filename); + } + + + err = get_filesystem_dnode(&(data->mos), fsname, dn, data); + + if (err) { + free(fsname); + free(snapname); + return err; + } + + headobj = grub_zfs_to_cpu64(((dsl_dir_phys_t *) DN_BONUS(&dn->dn))->dd_head_dataset_obj, dn->endian); + + err = dnode_get(&(data->mos), headobj, DMU_OT_DSL_DATASET, mdn, data); + if (err) { + free(fsname); + free(snapname); + return err; + } + + if (snapname) { + uint64_t snapobj; + + snapobj = grub_zfs_to_cpu64(((dsl_dataset_phys_t *) DN_BONUS(&mdn->dn))->ds_snapnames_zapobj, mdn->endian); + + err = dnode_get(&(data->mos), snapobj, + DMU_OT_DSL_DS_SNAP_MAP, mdn, data); + if (!err) + err = zap_lookup(mdn, snapname, &headobj, data); + if (!err) + err = dnode_get(&(data->mos), headobj, DMU_OT_DSL_DATASET, mdn, data); + if (err) { + free(fsname); + free(snapname); + return err; + } + } + + if (mdnobj) + *mdnobj = headobj; + + make_mdn(mdn, data); + + if (*isfs) { + free(fsname); + free(snapname); + return ZFS_ERR_NONE; + } + err = dnode_get_path(mdn, filename, dn, data); + free(fsname); + free(snapname); + return err; +} + +/* + * For a given XDR packed nvlist, verify the first 4 bytes and move on. 
+ * + * An XDR packed nvlist is encoded as (comments from nvs_xdr_create): + * + * encoding method/host endian (4 bytes) + * nvl_version (4 bytes) + * nvl_nvflag (4 bytes) + * encoded nvpairs: + * encoded size of the nvpair (4 bytes) + * decoded size of the nvpair (4 bytes) + * name string size (4 bytes) + * name string data (sizeof(NV_ALIGN4(string))) + * data type (4 bytes) + * # of elements in the nvpair (4 bytes) + * data + * 2 zeros for the last nvpair + * (end of the entire list) (8 bytes) + * + */ + +static int +nvlist_find_value(char *nvlist, char *name, int valtype, char **val, + size_t *size_out, size_t *nelm_out) +{ + int name_len, type, encode_size; + char *nvpair, *nvp_name; + + /* Verify that the 1st and 2nd bytes in the nvlist are valid. */ + /* NOTE: regardless of the endianness the header announces, all + subsequent values are big-endian. */ + if (nvlist[0] != NV_ENCODE_XDR || (nvlist[1] != NV_LITTLE_ENDIAN + && nvlist[1] != NV_BIG_ENDIAN)) { + printf("zfs incorrect nvlist header\n"); + return ZFS_ERR_BAD_FS; + } + + /* skip the header, nvl_version, and nvl_nvflag */ + nvlist = nvlist + 4 * 3; + /* + * Loop through the nvpair list. + * The XDR representation of an integer is in big-endian byte order. 
+ */ + while ((encode_size = grub_be_to_cpu32(*(uint32_t *) nvlist))) { + int nelm; + + nvpair = nvlist + 4 * 2; /* skip the encode/decode size */ + + name_len = grub_be_to_cpu32(*(uint32_t *) nvpair); + nvpair += 4; + + nvp_name = nvpair; + nvpair = nvpair + ((name_len + 3) & ~3); /* align */ + + type = grub_be_to_cpu32(*(uint32_t *) nvpair); + nvpair += 4; + + nelm = grub_be_to_cpu32(*(uint32_t *) nvpair); + if (nelm < 1) { + printf("empty nvpair\n"); + return ZFS_ERR_BAD_FS; + } + + nvpair += 4; + + if ((strncmp(nvp_name, name, name_len) == 0) && type == valtype) { + *val = nvpair; + *size_out = encode_size; + if (nelm_out) + *nelm_out = nelm; + return 1; + } + + nvlist += encode_size; /* goto the next nvpair */ + } + return 0; +} + +int +grub_zfs_nvlist_lookup_uint64(char *nvlist, char *name, uint64_t *out) +{ + char *nvpair; + size_t size; + int found; + + found = nvlist_find_value(nvlist, name, DATA_TYPE_UINT64, &nvpair, &size, 0); + if (!found) + return 0; + if (size < sizeof(uint64_t)) { + printf("invalid uint64\n"); + return ZFS_ERR_BAD_FS; + } + + *out = grub_be_to_cpu64(*(uint64_t *) nvpair); + return 1; +} + +char * +grub_zfs_nvlist_lookup_string(char *nvlist, char *name) +{ + char *nvpair; + char *ret; + size_t slen; + size_t size; + int found; + + found = nvlist_find_value(nvlist, name, DATA_TYPE_STRING, &nvpair, &size, 0); + if (!found) + return 0; + if (size < 4) { + printf("invalid string\n"); + return 0; + } + slen = grub_be_to_cpu32(*(uint32_t *) nvpair); + if (slen > size - 4) + slen = size - 4; + ret = malloc(slen + 1); + if (!ret) + return 0; + memcpy(ret, nvpair + 4, slen); + ret[slen] = 0; + return ret; +} + +char * +grub_zfs_nvlist_lookup_nvlist(char *nvlist, char *name) +{ + char *nvpair; + char *ret; + size_t size; + int found; + + found = nvlist_find_value(nvlist, name, DATA_TYPE_NVLIST, &nvpair, + &size, 0); + if (!found) + return 0; + ret = calloc(1, size + 3 * sizeof(uint32_t)); + if (!ret) + return 0; + memcpy(ret, nvlist, 
sizeof(uint32_t)); + + memcpy(ret + sizeof(uint32_t), nvpair, size); + return ret; +} + +int +grub_zfs_nvlist_lookup_nvlist_array_get_nelm(char *nvlist, char *name) +{ + char *nvpair; + size_t nelm, size; + int found; + + found = nvlist_find_value(nvlist, name, DATA_TYPE_NVLIST, &nvpair, + &size, &nelm); + if (!found) + return -1; + return nelm; +} + +char * +grub_zfs_nvlist_lookup_nvlist_array(char *nvlist, char *name, + size_t index) +{ + char *nvpair, *nvpairptr; + int found; + char *ret; + size_t size; + unsigned i; + size_t nelm; + + found = nvlist_find_value(nvlist, name, DATA_TYPE_NVLIST, &nvpair, + &size, &nelm); + if (!found) + return 0; + if (index >= nelm) { + printf("trying to lookup past nvlist array\n"); + return 0; + } + + nvpairptr = nvpair; + + for (i = 0; i < index; i++) { + uint32_t encode_size; + + /* skip the header, nvl_version, and nvl_nvflag */ + nvpairptr = nvpairptr + 4 * 2; + + while (nvpairptr < nvpair + size + && (encode_size = grub_be_to_cpu32(*(uint32_t *) nvpairptr))) + nvpairptr += encode_size; /* go to the next nvpair */ + + nvpairptr = nvpairptr + 4 * 2; /* skip the ending 2 zeros - 8 bytes */ + } + + if (nvpairptr >= nvpair + size + || nvpairptr + grub_be_to_cpu32(*(uint32_t *) (nvpairptr + 4 * 2)) + >= nvpair + size) { + printf("incorrect nvlist array\n"); + return 0; + } + + ret = calloc(1, grub_be_to_cpu32(*(uint32_t *) (nvpairptr + 4 * 2)) + + 3 * sizeof(uint32_t)); + if (!ret) + return 0; + memcpy(ret, nvlist, sizeof(uint32_t)); + + memcpy(ret + sizeof(uint32_t), nvpairptr, size); + return ret; +} + +static int +zfs_fetch_nvlist(struct grub_zfs_data *data, char **nvlist) +{ + int err; + + *nvlist = malloc(VDEV_PHYS_SIZE); + /* Read in the vdev name-value pair list (112K). */ + err = zfs_devread(data->vdev_phys_sector, 0, VDEV_PHYS_SIZE, *nvlist); + if (err) { + free(*nvlist); + *nvlist = 0; + return err; + } + return ZFS_ERR_NONE; +} + +/* + * Check the disk label information and retrieve needed vdev name-value pairs. 
+ * + */ +static int +check_pool_label(struct grub_zfs_data *data) +{ + uint64_t pool_state; + char *nvlist; /* for the pool */ + char *vdevnvlist; /* for the vdev */ + uint64_t diskguid; + uint64_t version; + int found; + int err; + + err = zfs_fetch_nvlist(data, &nvlist); + if (err) + return err; + + found = grub_zfs_nvlist_lookup_uint64(nvlist, ZPOOL_CONFIG_POOL_STATE, + &pool_state); + if (!found) { + free(nvlist); + printf("zfs pool state not found\n"); + return ZFS_ERR_BAD_FS; + } + + if (pool_state == POOL_STATE_DESTROYED) { + free(nvlist); + printf("zpool is marked as destroyed\n"); + return ZFS_ERR_BAD_FS; + } + + data->label_txg = 0; + found = grub_zfs_nvlist_lookup_uint64(nvlist, ZPOOL_CONFIG_POOL_TXG, + &data->label_txg); + if (!found) { + free(nvlist); + printf("zfs pool txg not found\n"); + return ZFS_ERR_BAD_FS; + } + + /* not an active device */ + if (data->label_txg == 0) { + free(nvlist); + printf("zpool is not active\n"); + return ZFS_ERR_BAD_FS; + } + + found = grub_zfs_nvlist_lookup_uint64(nvlist, ZPOOL_CONFIG_VERSION, + &version); + if (!found) { + free(nvlist); + printf("zpool config version not found\n"); + return ZFS_ERR_BAD_FS; + } + + if (version > SPA_VERSION) { + free(nvlist); + printf("SPA version too new %llu > %llu\n", + (unsigned long long) version, + (unsigned long long) SPA_VERSION); + return ZFS_ERR_NOT_IMPLEMENTED_YET; + } + + vdevnvlist = grub_zfs_nvlist_lookup_nvlist(nvlist, ZPOOL_CONFIG_VDEV_TREE); + if (!vdevnvlist) { + free(nvlist); + printf("ZFS config vdev tree not found\n"); + return ZFS_ERR_BAD_FS; + } + + found = grub_zfs_nvlist_lookup_uint64(vdevnvlist, ZPOOL_CONFIG_ASHIFT, + &data->vdev_ashift); + free(vdevnvlist); + if (!found) { + free(nvlist); + printf("ZPOOL config ashift not found\n"); + return ZFS_ERR_BAD_FS; + } + + found = grub_zfs_nvlist_lookup_uint64(nvlist, ZPOOL_CONFIG_GUID, &diskguid); + if (!found) { + free(nvlist); + printf("ZPOOL config guid not found\n"); + return ZFS_ERR_BAD_FS; + } + + found = 
grub_zfs_nvlist_lookup_uint64(nvlist, ZPOOL_CONFIG_POOL_GUID, &data->pool_guid); + if (!found) { + free(nvlist); + printf("ZPOOL config pool guid not found\n"); + return ZFS_ERR_BAD_FS; + } + + free(nvlist); + + printf("ZFS Pool GUID: %llu (%016llx) Label: GUID: %llu (%016llx), txg: %llu, SPA v%llu, ashift: %llu\n", + (unsigned long long) data->pool_guid, + (unsigned long long) data->pool_guid, + (unsigned long long) diskguid, + (unsigned long long) diskguid, + (unsigned long long) data->label_txg, + (unsigned long long) version, + (unsigned long long) data->vdev_ashift); + + return ZFS_ERR_NONE; +} + +/* + * vdev_label_start returns the physical disk offset (in bytes) of + * label "l". + */ +static uint64_t vdev_label_start(uint64_t psize, int l) +{ + return (l * sizeof(vdev_label_t) + (l < VDEV_LABELS / 2 ? + 0 : psize - + VDEV_LABELS * sizeof(vdev_label_t))); +} + +void +zfs_unmount(struct grub_zfs_data *data) +{ + free(data->dnode_buf); + free(data->dnode_mdn); + free(data->file_buf); + free(data); +} + +/* + * zfs_mount() locates a valid uberblock of the root pool and reads its MOS + * into the memory address MOS. + * + */ +struct grub_zfs_data * +zfs_mount(device_t dev) +{ + struct grub_zfs_data *data = 0; + int label = 0, bestlabel = -1; + char *ub_array; + uberblock_t *ubbest; + uberblock_t *ubcur = NULL; + void *osp = 0; + size_t ospsize; + int err; + + data = malloc(sizeof(*data)); + if (!data) + return 0; + memset(data, 0, sizeof(*data)); + + ub_array = malloc(VDEV_UBERBLOCK_RING); + if (!ub_array) { + zfs_unmount(data); + return 0; + } + + ubbest = malloc(sizeof(*ubbest)); + if (!ubbest) { + zfs_unmount(data); + return 0; + } + memset(ubbest, 0, sizeof(*ubbest)); + + /* + * some eltorito stacks don't give us a size and + * we end up setting the size to MAXUINT, further + * some of these devices stop working once a single + * read past the end has been issued. 
Checking + * for a maximum part_length and skipping the backup + * labels at the end of the slice/partition/device + * avoids breaking down on such devices. + */ + const int vdevnum = + dev->part_length == 0 ? + VDEV_LABELS / 2 : VDEV_LABELS; + + /* Size in bytes of the device (disk or partition) aligned to label size*/ + uint64_t device_size = + dev->part_length << SECTOR_BITS; + + const uint64_t alignedbytes = + P2ALIGN(device_size, (uint64_t) sizeof(vdev_label_t)); + + for (label = 0; label < vdevnum; label++) { + uint64_t labelstartbytes = vdev_label_start(alignedbytes, label); + uint64_t labelstart = labelstartbytes >> SECTOR_BITS; + + debug("zfs reading label %d at sector %llu (byte %llu)\n", + label, (unsigned long long) labelstart, + (unsigned long long) labelstartbytes); + + data->vdev_phys_sector = labelstart + + ((VDEV_SKIP_SIZE + VDEV_BOOT_HEADER_SIZE) >> SECTOR_BITS); + + err = check_pool_label(data); + if (err) { + printf("zfs error checking label %d\n", label); + continue; + } + + /* Read in the uberblock ring (128K). */ + err = zfs_devread(data->vdev_phys_sector + + (VDEV_PHYS_SIZE >> SECTOR_BITS), + 0, VDEV_UBERBLOCK_RING, ub_array); + if (err) { + printf("zfs error reading uberblock ring for label %d\n", label); + continue; + } + + ubcur = find_bestub(ub_array, data); + if (!ubcur) { + printf("zfs No good uberblocks found in label %d\n", label); + continue; + } + + if (vdev_uberblock_compare(ubcur, ubbest) > 0) { + /* Looks like the block is good, so use it.*/ + memcpy(ubbest, ubcur, sizeof(*ubbest)); + bestlabel = label; + debug("zfs Current best uberblock found in label %d\n", label); + } + } + free(ub_array); + + /* We zero'd the structure to begin with. If we never assigned to it, + magic will still be zero. 
*/ + if (!ubbest->ub_magic) { + printf("couldn't find a valid ZFS label\n"); + zfs_unmount(data); + free(ubbest); + return 0; + } + + debug("zfs ubbest %p in label %d\n", ubbest, bestlabel); + + grub_zfs_endian_t ub_endian = + grub_zfs_to_cpu64(ubbest->ub_magic, LITTLE_ENDIAN) == UBERBLOCK_MAGIC + ? LITTLE_ENDIAN : BIG_ENDIAN; + + debug("zfs endian set to %s\n", !ub_endian ? "big" : "little"); + + err = zio_read(&ubbest->ub_rootbp, ub_endian, &osp, &ospsize, data); + + if (err) { + printf("couldn't zio_read object directory\n"); + zfs_unmount(data); + free(ubbest); + return 0; + } + + if (ospsize < OBJSET_PHYS_SIZE_V14) { + printf("osp too small\n"); + zfs_unmount(data); + free(osp); + free(ubbest); + return 0; + } + + /* Got the MOS. Save it at the memory addr MOS. */ + memmove(&(data->mos.dn), &((objset_phys_t *) osp)->os_meta_dnode, DNODE_SIZE); + data->mos.endian = + (grub_zfs_to_cpu64(ubbest->ub_rootbp.blk_prop, ub_endian) >> 63) & 1; + memmove(&(data->current_uberblock), ubbest, sizeof(uberblock_t)); + + free(osp); + free(ubbest); + + return data; +} + +int +grub_zfs_fetch_nvlist(device_t dev, char **nvlist) +{ + struct grub_zfs_data *zfs; + int err; + + zfs = zfs_mount(dev); + if (!zfs) + return ZFS_ERR_BAD_FS; + err = zfs_fetch_nvlist(zfs, nvlist); + zfs_unmount(zfs); + return err; +} + +static int +zfs_label(device_t device, char **label) +{ + char *nvlist; + int err; + struct grub_zfs_data *data; + + data = zfs_mount(device); + if (!data) + return ZFS_ERR_BAD_FS; + + err = zfs_fetch_nvlist(data, &nvlist); + if (err) { + zfs_unmount(data); + return err; + } + + *label = grub_zfs_nvlist_lookup_string(nvlist, ZPOOL_CONFIG_POOL_NAME); + free(nvlist); + zfs_unmount(data); + return ZFS_ERR_NONE; +} + +static int +zfs_uuid(device_t device, char **uuid) +{ + struct grub_zfs_data *data; + + data = zfs_mount(device); + if (!data) + return ZFS_ERR_BAD_FS; + + *uuid = malloc(17); /* %016llx + nil */ + if (!*uuid) + return ZFS_ERR_OUT_OF_MEMORY; + + /* *uuid = 
xasprintf ("%016llx", (long long unsigned) data->pool_guid);*/ + snprintf(*uuid, 17, "%016llx", (long long unsigned) data->pool_guid); + zfs_unmount(data); + + return ZFS_ERR_NONE; +} + +/* + * zfs_open() locates a file in the root pool by following the + * MOS and places the dnode of the file in the memory address DNODE. + */ +int +zfs_open(struct zfs_file *file, const char *fsfilename) +{ + struct grub_zfs_data *data; + int err; + int isfs; + + data = zfs_mount(file->device); + if (!data) + return ZFS_ERR_BAD_FS; + + err = dnode_get_fullpath(fsfilename, &(data->mdn), 0, + &(data->dnode), &isfs, data); + if (err) { + zfs_unmount(data); + return err; + } + + if (isfs) { + zfs_unmount(data); + printf("Missing @ or / separator\n"); + return ZFS_ERR_FILE_NOT_FOUND; + } + + /* We found the dnode for this file. Verify that it is a plain file. */ + if (data->dnode.dn.dn_type != DMU_OT_PLAIN_FILE_CONTENTS) { + zfs_unmount(data); + printf("not a file\n"); + return ZFS_ERR_BAD_FILE_TYPE; + } + + /* get the file size and set the file position to 0 */ + + /* + * For DMU_OT_SA we will need to locate the SIZE + * attribute, which could be either in the bonus buffer + * or the "spill" block. 
+ */ + if (data->dnode.dn.dn_bonustype == DMU_OT_SA) { + void *sahdrp; + int hdrsize; + + if (data->dnode.dn.dn_bonuslen != 0) { + sahdrp = (sa_hdr_phys_t *) DN_BONUS(&data->dnode.dn); + } else if (data->dnode.dn.dn_flags & DNODE_FLAG_SPILL_BLKPTR) { + blkptr_t *bp = &data->dnode.dn.dn_spill; + + err = zio_read(bp, data->dnode.endian, &sahdrp, NULL, data); + if (err) + return err; + } else { + printf("filesystem is corrupt :(\n"); + return ZFS_ERR_BAD_FS; + } + + hdrsize = SA_HDR_SIZE(((sa_hdr_phys_t *) sahdrp)); + file->size = *(uint64_t *) ((char *) sahdrp + hdrsize + SA_SIZE_OFFSET); + } else { + file->size = grub_zfs_to_cpu64(((znode_phys_t *) DN_BONUS(&data->dnode.dn))->zp_size, data->dnode.endian); + } + + file->data = data; + file->offset = 0; + + return ZFS_ERR_NONE; +} + +uint64_t +zfs_read(zfs_file_t file, char *buf, uint64_t len) +{ + struct grub_zfs_data *data = (struct grub_zfs_data *) file->data; + int blksz, movesize; + uint64_t length; + int64_t red; + int err; + + if (data->file_buf == NULL) { + data->file_buf = malloc(SPA_MAXBLOCKSIZE); + if (!data->file_buf) + return -1; + data->file_start = data->file_end = 0; + } + + /* + * If offset is in memory, move it into the buffer provided and return. + */ + if (file->offset >= data->file_start + && file->offset + len <= data->file_end) { + memmove(buf, data->file_buf + file->offset - data->file_start, + len); + return len; + } + + blksz = grub_zfs_to_cpu16(data->dnode.dn.dn_datablkszsec, + data->dnode.endian) << SPA_MINBLOCKSHIFT; + + /* + * Entire Dnode is too big to fit into the space available. We + * will need to read it in chunks. This could be optimized to + * read in as large a chunk as there is space available, but for + * now, this only reads in one data block at a time. + */ + length = len; + red = 0; + while (length) { + void *t; + /* + * Find requested blkid and the offset within that block. 
+ */ + uint64_t blkid = (file->offset + red) / blksz; + free(data->file_buf); + data->file_buf = 0; + + err = dmu_read(&(data->dnode), blkid, &t, + 0, data); + data->file_buf = t; + if (err) + return -1; + + data->file_start = blkid * blksz; + data->file_end = data->file_start + blksz; + + movesize = MIN(length, data->file_end - (int) file->offset - red); + + memmove(buf, data->file_buf + file->offset + red + - data->file_start, movesize); + buf += movesize; + length -= movesize; + red += movesize; + } + + return len; +} + +int +zfs_close(zfs_file_t file) +{ + zfs_unmount((struct grub_zfs_data *) file->data); + return ZFS_ERR_NONE; +} + +int +grub_zfs_getmdnobj(device_t dev, const char *fsfilename, + uint64_t *mdnobj) +{ + struct grub_zfs_data *data; + int err; + int isfs; + + data = zfs_mount(dev); + if (!data) + return ZFS_ERR_BAD_FS; + + err = dnode_get_fullpath(fsfilename, &(data->mdn), mdnobj, + &(data->dnode), &isfs, data); + zfs_unmount(data); + return err; +} + +static void +fill_fs_info(struct zfs_dirhook_info *info, + dnode_end_t mdn, struct grub_zfs_data *data) +{ + int err; + dnode_end_t dn; + uint64_t objnum; + uint64_t headobj; + + memset(info, 0, sizeof(*info)); + + info->dir = 1; + + if (mdn.dn.dn_type == DMU_OT_DSL_DIR) { + headobj = grub_zfs_to_cpu64(((dsl_dir_phys_t *) DN_BONUS(&mdn.dn))->dd_head_dataset_obj, mdn.endian); + + err = dnode_get(&(data->mos), headobj, DMU_OT_DSL_DATASET, &mdn, data); + if (err) { + printf("zfs failed here 1\n"); + return; + } + } + make_mdn(&mdn, data); + err = dnode_get(&mdn, MASTER_NODE_OBJ, DMU_OT_MASTER_NODE, + &dn, data); + if (err) { + printf("zfs failed here 2\n"); + return; + } + + err = zap_lookup(&dn, ZFS_ROOT_OBJ, &objnum, data); + if (err) { + printf("zfs failed here 3\n"); + return; + } + + err = dnode_get(&mdn, objnum, 0, &dn, data); + if (err) { + printf("zfs failed here 4\n"); + return; + } + + info->mtimeset = 1; + info->mtime = grub_zfs_to_cpu64(((znode_phys_t *) DN_BONUS(&dn.dn))->zp_mtime[0], 
dn.endian); + + return; +} + +static int iterate_zap(const char *name, uint64_t val, struct grub_zfs_data *data) +{ + struct zfs_dirhook_info info; + dnode_end_t dn; + + memset(&info, 0, sizeof(info)); + + dnode_get(&(data->mdn), val, 0, &dn, data); + info.mtimeset = 1; + info.mtime = grub_zfs_to_cpu64(((znode_phys_t *) DN_BONUS(&dn.dn))->zp_mtime[0], dn.endian); + info.dir = (dn.dn.dn_type == DMU_OT_DIRECTORY_CONTENTS); + debug("zfs type=%d, name=%s\n", + (int)dn.dn.dn_type, (char *)name); + if (!data->userhook) + return 0; + return data->userhook(name, &info); +} + +static int iterate_zap_fs(const char *name, uint64_t val, struct grub_zfs_data *data) +{ + struct zfs_dirhook_info info; + dnode_end_t mdn; + int err; + err = dnode_get(&(data->mos), val, 0, &mdn, data); + if (err) + return 0; + if (mdn.dn.dn_type != DMU_OT_DSL_DIR) + return 0; + + fill_fs_info(&info, mdn, data); + + if (!data->userhook) + return 0; + return data->userhook(name, &info); +} + +static int iterate_zap_snap(const char *name, uint64_t val, struct grub_zfs_data *data) +{ + struct zfs_dirhook_info info; + char *name2; + int ret = 0; + dnode_end_t mdn; + int err; + + err = dnode_get(&(data->mos), val, 0, &mdn, data); + if (err) + return 0; + + if (mdn.dn.dn_type != DMU_OT_DSL_DATASET) + return 0; + + fill_fs_info(&info, mdn, data); + + name2 = malloc(strlen(name) + 2); + name2[0] = '@'; + memcpy(name2 + 1, name, strlen(name) + 1); + if (data->userhook) + ret = data->userhook(name2, &info); + free(name2); + return ret; +} + +int +zfs_ls(device_t device, const char *path, + int (*hook)(const char *, const struct zfs_dirhook_info *)) +{ + struct grub_zfs_data *data; + int err; + int isfs; +#if 0 + char *label = NULL; + + zfs_label(device, &label); + if (label) + printf("ZPOOL label '%s'\n", + label); +#endif + + data = zfs_mount(device); + if (!data) + return ZFS_ERR_BAD_FS; + + data->userhook = hook; + + err = dnode_get_fullpath(path, &(data->mdn), 0, &(data->dnode), &isfs, data); + if (err) { 
+ zfs_unmount(data); + return err; + } + if (isfs) { + uint64_t childobj, headobj; + uint64_t snapobj; + dnode_end_t dn; + struct zfs_dirhook_info info; + + fill_fs_info(&info, data->dnode, data); + hook("@", &info); + + childobj = grub_zfs_to_cpu64(((dsl_dir_phys_t *) DN_BONUS(&data->dnode.dn))->dd_child_dir_zapobj, data->dnode.endian); + headobj = grub_zfs_to_cpu64(((dsl_dir_phys_t *) DN_BONUS(&data->dnode.dn))->dd_head_dataset_obj, data->dnode.endian); + err = dnode_get(&(data->mos), childobj, + DMU_OT_DSL_DIR_CHILD_MAP, &dn, data); + if (err) { + zfs_unmount(data); + return err; + } + + + zap_iterate(&dn, iterate_zap_fs, data); + + err = dnode_get(&(data->mos), headobj, DMU_OT_DSL_DATASET, &dn, data); + if (err) { + zfs_unmount(data); + return err; + } + + snapobj = grub_zfs_to_cpu64(((dsl_dataset_phys_t *) DN_BONUS(&dn.dn))->ds_snapnames_zapobj, dn.endian); + + err = dnode_get(&(data->mos), snapobj, + DMU_OT_DSL_DS_SNAP_MAP, &dn, data); + if (err) { + zfs_unmount(data); + return err; + } + + zap_iterate(&dn, iterate_zap_snap, data); + } else { + if (data->dnode.dn.dn_type != DMU_OT_DIRECTORY_CONTENTS) { + zfs_unmount(data); + printf("not a directory\n"); + return ZFS_ERR_BAD_FILE_TYPE; + } + zap_iterate(&(data->dnode), iterate_zap, data); + } + zfs_unmount(data); + return ZFS_ERR_NONE; +} + diff --git a/fs/zfs/zfs_fletcher.c b/fs/zfs/zfs_fletcher.c new file mode 100644 index 0000000..d96c6ff --- /dev/null +++ b/fs/zfs/zfs_fletcher.c @@ -0,0 +1,84 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004,2009 + * Free Software Foundation, Inc. + * Copyright 2007 Sun Microsystems, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. 
+ * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ + +#include <common.h> +#include <malloc.h> +#include <linux/stat.h> +#include <linux/time.h> +#include <linux/ctype.h> +#include <asm/byteorder.h> +#include "zfs_common.h" + +#include <zfs/zfs.h> +#include <zfs/zio.h> +#include <zfs/dnode.h> +#include <zfs/uberblock_impl.h> +#include <zfs/vdev_impl.h> +#include <zfs/zio_checksum.h> +#include <zfs/zap_impl.h> +#include <zfs/zap_leaf.h> +#include <zfs/zfs_znode.h> +#include <zfs/dmu.h> +#include <zfs/dmu_objset.h> +#include <zfs/dsl_dir.h> +#include <zfs/dsl_dataset.h> + +void +fletcher_2(const void *buf, uint64_t size, grub_zfs_endian_t endian, + zio_cksum_t *zcp) +{ + const uint64_t *ip = buf; + const uint64_t *ipend = ip + (size / sizeof(uint64_t)); + uint64_t a0, b0, a1, b1; + + for (a0 = b0 = a1 = b1 = 0; ip < ipend; ip += 2) { + a0 += grub_zfs_to_cpu64(ip[0], endian); + a1 += grub_zfs_to_cpu64(ip[1], endian); + b0 += a0; + b1 += a1; + } + + zcp->zc_word[0] = grub_cpu_to_zfs64(a0, endian); + zcp->zc_word[1] = grub_cpu_to_zfs64(a1, endian); + zcp->zc_word[2] = grub_cpu_to_zfs64(b0, endian); + zcp->zc_word[3] = grub_cpu_to_zfs64(b1, endian); +} + +void +fletcher_4(const void *buf, uint64_t size, grub_zfs_endian_t endian, + zio_cksum_t *zcp) +{ + const uint32_t *ip = buf; + const uint32_t *ipend = ip + (size / sizeof(uint32_t)); + uint64_t a, b, c, d; + + for (a = b = c = d = 0; ip < ipend; ip++) { + a += grub_zfs_to_cpu32(ip[0], endian); + b += a; + c += b; + d += c; + } + + zcp->zc_word[0] = grub_cpu_to_zfs64(a, endian); + zcp->zc_word[1] = grub_cpu_to_zfs64(b, endian); + zcp->zc_word[2] = grub_cpu_to_zfs64(c, endian); + 
zcp->zc_word[3] = grub_cpu_to_zfs64(d, endian); +} + diff --git a/fs/zfs/zfs_lzjb.c b/fs/zfs/zfs_lzjb.c new file mode 100644 index 0000000..33e9b90 --- /dev/null +++ b/fs/zfs/zfs_lzjb.c @@ -0,0 +1,94 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004,2009 + * Free Software Foundation, Inc. + * Copyright 2007 Sun Microsystems, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. 
+ */ + +#include <common.h> +#include <malloc.h> +#include <linux/stat.h> +#include <linux/time.h> +#include <linux/ctype.h> +#include <asm/byteorder.h> +#include "zfs_common.h" + +#include <zfs/zfs.h> +#include <zfs/zio.h> +#include <zfs/dnode.h> +#include <zfs/uberblock_impl.h> +#include <zfs/vdev_impl.h> +#include <zfs/zio_checksum.h> +#include <zfs/zap_impl.h> +#include <zfs/zap_leaf.h> +#include <zfs/zfs_znode.h> +#include <zfs/dmu.h> +#include <zfs/dmu_objset.h> +#include <zfs/dsl_dir.h> +#include <zfs/dsl_dataset.h> + +#define MATCH_BITS 6 +#define MATCH_MIN 3 +#define OFFSET_MASK ((1 << (16 - MATCH_BITS)) - 1) + +/* + * Decompression Entry - lzjb + */ +#ifndef NBBY +#define NBBY 8 +#endif + +int +lzjb_decompress(void *s_start, void *d_start, uint32_t s_len, + uint32_t d_len) +{ + uint8_t *src = s_start; + uint8_t *dst = d_start; + uint8_t *d_end = (uint8_t *) d_start + d_len; + uint8_t *s_end = (uint8_t *) s_start + s_len; + uint8_t *cpy, copymap = 0; + int copymask = 1 << (NBBY - 1); + + while (dst < d_end && src < s_end) { + if ((copymask <<= 1) == (1 << NBBY)) { + copymask = 1; + copymap = *src++; + } + if (src >= s_end) { + printf("lzjb decompression failed\n"); + return ZFS_ERR_BAD_FS; + } + if (copymap & copymask) { + int mlen = (src[0] >> (NBBY - MATCH_BITS)) + MATCH_MIN; + int offset = ((src[0] << NBBY) | src[1]) & OFFSET_MASK; + src += 2; + cpy = dst - offset; + if (src > s_end || cpy < (uint8_t *) d_start) { + printf("lzjb decompression failed\n"); + return ZFS_ERR_BAD_FS; + } + while (--mlen >= 0 && dst < d_end) + *dst++ = *cpy++; + } else { + *dst++ = *src++; + } + } + if (dst < d_end) { + printf("lzjb decompression failed\n"); + return ZFS_ERR_BAD_FS; + } + return ZFS_ERR_NONE; +} diff --git a/fs/zfs/zfs_sha256.c b/fs/zfs/zfs_sha256.c new file mode 100644 index 0000000..7a9439a --- /dev/null +++ b/fs/zfs/zfs_sha256.c @@ -0,0 +1,145 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004,2009 + * Free Software 
Foundation, Inc. + * Copyright 2007 Sun Microsystems, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ + +#include <common.h> +#include <malloc.h> +#include <linux/stat.h> +#include <linux/time.h> +#include <linux/ctype.h> +#include <asm/byteorder.h> +#include "zfs_common.h" + +#include <zfs/zfs.h> +#include <zfs/zio.h> +#include <zfs/dnode.h> +#include <zfs/uberblock_impl.h> +#include <zfs/vdev_impl.h> +#include <zfs/zio_checksum.h> +#include <zfs/zap_impl.h> +#include <zfs/zap_leaf.h> +#include <zfs/zfs_znode.h> +#include <zfs/dmu.h> +#include <zfs/dmu_objset.h> +#include <zfs/dsl_dir.h> +#include <zfs/dsl_dataset.h> + +/* + * SHA-256 checksum, as specified in FIPS 180-2, available at: + * http://csrc.nist.gov/cryptval + * + * This is a very compact implementation of SHA-256. + * It is designed to be simple and portable, not to be fast. + */ + +/* + * The literal definitions according to FIPS180-2 would be: + * + * Ch(x, y, z) (((x) & (y)) ^ ((~(x)) & (z))) + * Maj(x, y, z) (((x) & (y)) | ((x) & (z)) | ((y) & (z))) + * + * We use logical equivalents which require one less op. 
+ */ +#define Ch(x, y, z) ((z) ^ ((x) & ((y) ^ (z)))) +#define Maj(x, y, z) (((x) & (y)) ^ ((z) & ((x) ^ (y)))) +#define Rot32(x, s) (((x) >> s) | ((x) << (32 - s))) +#define SIGMA0(x) (Rot32(x, 2) ^ Rot32(x, 13) ^ Rot32(x, 22)) +#define SIGMA1(x) (Rot32(x, 6) ^ Rot32(x, 11) ^ Rot32(x, 25)) +#define sigma0(x) (Rot32(x, 7) ^ Rot32(x, 18) ^ ((x) >> 3)) +#define sigma1(x) (Rot32(x, 17) ^ Rot32(x, 19) ^ ((x) >> 10)) + +static const uint32_t SHA256_K[64] = { + 0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5, + 0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5, + 0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3, + 0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174, + 0xe49b69c1, 0xefbe4786, 0x0fc19dc6, 0x240ca1cc, + 0x2de92c6f, 0x4a7484aa, 0x5cb0a9dc, 0x76f988da, + 0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7, + 0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967, + 0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, 0x53380d13, + 0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85, + 0xa2bfe8a1, 0xa81a664b, 0xc24b8b70, 0xc76c51a3, + 0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070, + 0x19a4c116, 0x1e376c08, 0x2748774c, 0x34b0bcb5, + 0x391c0cb3, 0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3, + 0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208, + 0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2 +}; + +static void +SHA256Transform(uint32_t *H, const uint8_t *cp) +{ + uint32_t a, b, c, d, e, f, g, h, t, T1, T2, W[64]; + + for (t = 0; t < 16; t++, cp += 4) + W[t] = (cp[0] << 24) | (cp[1] << 16) | (cp[2] << 8) | cp[3]; + + for (t = 16; t < 64; t++) + W[t] = sigma1(W[t - 2]) + W[t - 7] + + sigma0(W[t - 15]) + W[t - 16]; + + a = H[0]; b = H[1]; c = H[2]; d = H[3]; + e = H[4]; f = H[5]; g = H[6]; h = H[7]; + + for (t = 0; t < 64; t++) { + T1 = h + SIGMA1(e) + Ch(e, f, g) + SHA256_K[t] + W[t]; + T2 = SIGMA0(a) + Maj(a, b, c); + h = g; g = f; f = e; e = d + T1; + d = c; c = b; b = a; a = T1 + T2; + } + + H[0] += a; H[1] += b; H[2] += c; H[3] += d; + H[4] += e; H[5] += f; H[6] += g; H[7] += h; +} + +void +zio_checksum_SHA256(const 
void *buf, uint64_t size, + grub_zfs_endian_t endian, zio_cksum_t *zcp) +{ + uint32_t H[8] = { 0x6a09e667, 0xbb67ae85, 0x3c6ef372, 0xa54ff53a, + 0x510e527f, 0x9b05688c, 0x1f83d9ab, 0x5be0cd19 }; + uint8_t pad[128]; + unsigned padsize = size & 63; + unsigned i; + + for (i = 0; i < size - padsize; i += 64) + SHA256Transform(H, (uint8_t *)buf + i); + + for (i = 0; i < padsize; i++) + pad[i] = ((uint8_t *)buf)[i]; + + for (pad[padsize++] = 0x80; (padsize & 63) != 56; padsize++) + pad[padsize] = 0; + + for (i = 0; i < 8; i++) + pad[padsize++] = (size << 3) >> (56 - 8 * i); + + for (i = 0; i < padsize; i += 64) + SHA256Transform(H, pad + i); + + zcp->zc_word[0] = grub_cpu_to_zfs64((uint64_t)H[0] << 32 | H[1], + endian); + zcp->zc_word[1] = grub_cpu_to_zfs64((uint64_t)H[2] << 32 | H[3], + endian); + zcp->zc_word[2] = grub_cpu_to_zfs64((uint64_t)H[4] << 32 | H[5], + endian); + zcp->zc_word[3] = grub_cpu_to_zfs64((uint64_t)H[6] << 32 | H[7], + endian); +} diff --git a/include/config_cmd_all.h b/include/config_cmd_all.h index 55f4f7a..5933ae9 100644 --- a/include/config_cmd_all.h +++ b/include/config_cmd_all.h @@ -36,6 +36,7 @@ #define CONFIG_CMD_ELF /* ELF (VxWorks) load/boot cmd */ #define CONFIG_CMD_EXT2 /* EXT2 Support */ #define CONFIG_CMD_FAT /* FAT support */ +#define CONFIG_CMD_ZFS /* ZFS support */ #define CONFIG_CMD_FDC /* Floppy Disk Support */ #define CONFIG_CMD_FDOS /* Floppy DOS support */ #define CONFIG_CMD_FLASH /* flinfo, erase, protect */ diff --git a/include/zfs_common.h b/include/zfs_common.h new file mode 100644 index 0000000..969dbf5 --- /dev/null +++ b/include/zfs_common.h @@ -0,0 +1,94 @@ +/* + * ZFS filesystem implementation in Uboot by + * Jorgen Lundman <lundman at lundman.net> + * + * zfsfs support + * made from existing GRUB Sources by Sun, GNU and others. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. 
+ * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. + */ + +#ifndef __ZFS_COMMON__ +#define __ZFS_COMMON__ + +#define SECTOR_SIZE 0x200 +#define SECTOR_BITS 9 + +#define grub_le_to_cpu16 le16_to_cpu +#define grub_be_to_cpu16 be16_to_cpu +#define grub_le_to_cpu32 le32_to_cpu +#define grub_be_to_cpu32 be32_to_cpu +#define grub_le_to_cpu64 le64_to_cpu +#define grub_be_to_cpu64 be64_to_cpu + +#define grub_cpu_to_le64 cpu_to_le64 +#define grub_cpu_to_be64 cpu_to_be64 + +enum zfs_errors { + ZFS_ERR_NONE = 0, + ZFS_ERR_NOT_IMPLEMENTED_YET = -1, + ZFS_ERR_BAD_FS = -2, + ZFS_ERR_OUT_OF_MEMORY = -3, + ZFS_ERR_FILE_NOT_FOUND = -4, + ZFS_ERR_BAD_FILE_TYPE = -5, + ZFS_ERR_OUT_OF_RANGE = -6, +}; + +struct zfs_filesystem { + + /* Block Device Descriptor */ + block_dev_desc_t *dev_desc; +}; + + +extern block_dev_desc_t *zfs_dev_desc; + +struct device_s { + uint64_t part_length; +}; +typedef struct device_s *device_t; + +struct zfs_file { + device_t device; + uint64_t size; + void *data; + uint64_t offset; +}; + +typedef struct zfs_file *zfs_file_t; + +struct zfs_dirhook_info { + int dir; + int mtimeset; + time_t mtime; + time_t mtime2; +}; + + + + +struct zfs_filesystem *zfsget_fs(void); +int init_fs(block_dev_desc_t *dev_desc); +void deinit_fs(block_dev_desc_t *dev_desc); +int zfs_open(zfs_file_t, const char *filename); +uint64_t zfs_read(zfs_file_t, char *buf, uint64_t len); +struct grub_zfs_data *zfs_mount(device_t); +int zfs_close(zfs_file_t); +int zfs_ls(device_t dev, const char *path, + int (*hook) (const char *, const struct zfs_dirhook_info *)); +int zfs_devread(int sector, int 
byte_offset, int byte_len, char *buf); +int zfs_set_blk_dev(block_dev_desc_t *rbdd, int part); +void zfs_unmount(struct grub_zfs_data *data); +int lzjb_decompress(void *, void *, uint32_t, uint32_t); +#endif
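The match-token handling in lzjb_decompress() (fs/zfs/zfs_lzjb.c, above) can be looked at in isolation. The sketch below unpacks one two-byte token using the same MATCH_BITS, MATCH_MIN and OFFSET_MASK values as the patch. The struct lzjb_match type and the lzjb_decode_match() helper are hypothetical names introduced only for this illustration; they are not part of the patch.

```c
#include <stdint.h>

#define NBBY        8
#define MATCH_BITS  6
#define MATCH_MIN   3
#define OFFSET_MASK ((1 << (16 - MATCH_BITS)) - 1)

/* Hypothetical helper type, not part of the patch. */
struct lzjb_match {
	int length;	/* bytes to copy, MATCH_MIN .. MATCH_MIN + 63 */
	int offset;	/* distance back from the current output position */
};

/* Unpack the two-byte match token stored at src[0] and src[1]. */
struct lzjb_match lzjb_decode_match(const uint8_t *src)
{
	struct lzjb_match m;

	/* Top MATCH_BITS bits of the first byte hold length - MATCH_MIN. */
	m.length = (src[0] >> (NBBY - MATCH_BITS)) + MATCH_MIN;
	/* The remaining 10 bits across both bytes hold the distance. */
	m.offset = ((src[0] << NBBY) | src[1]) & OFFSET_MASK;
	return m;
}
```

For the token bytes 0x04 0x05 this yields a length of 4 and an offset of 5: copy four bytes starting five bytes back in the output buffer.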

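The pad[128] buffer in zio_checksum_SHA256() above follows from the SHA-256 padding rule: the leftover bytes are followed by one 0x80 byte, zeros up to offset 56 within a 64-byte block, and an 8-byte bit length. A minimal sketch of that size calculation, using a hypothetical helper name (sha256_tail_len) that does not appear in the patch:

```c
#include <stdint.h>

/*
 * Number of bytes in the final padded region: leftover message bytes,
 * one 0x80 byte, zero fill up to offset 56 (mod 64), then the 8-byte
 * big-endian bit length. Hypothetical helper, not part of the patch.
 */
unsigned sha256_tail_len(uint64_t size)
{
	/* Leftover bytes in the last partial block plus the 0x80 marker. */
	unsigned used = (unsigned)(size & 63) + 1;

	/* The 8 length bytes only fit in this block if used <= 56. */
	return (used <= 56) ? 64 : 128;
}
```

At most two extra 64-byte blocks are ever hashed, which is why 128 bytes of padding space is enough.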
Hi Jorgen,
On Wed, May 23, 2012 at 1:45 PM, Jorgen Lundman lundman@lundman.net wrote:
> Signed-off-by: Jorgen Lundman lundman@lundman.net
> Makefile         |   2 +-
You just committed a major ML no-no
Always put a revision summary below the --- (even if it's a 'no change')
Regards,
Graeme

Patch to add ZFS filesystem support to u-boot, based on GRUB sources. Thank you for your patience.
Jorgen Lundman (1):
  Add ZFS filesystem support

 Makefile                     |    2 +-
 common/Makefile              |    1 +
 common/cmd_zfs.c             |  236 +++++
 fs/Makefile                  |    3 +-
 fs/{ => zfs}/Makefile        |   39 +-
 fs/zfs/dev.c                 |  137 +++
 fs/zfs/zfs.c                 | 2396 ++++++++++++++++++++++++++++++++++++++++++
 fs/zfs/zfs_fletcher.c        |   84 ++
 fs/zfs/zfs_lzjb.c            |   94 ++
 fs/zfs/zfs_sha256.c          |  145 +++
 include/config_cmd_all.h     |    1 +
 include/zfs/dmu.h            |  119 +++
 include/zfs/dmu_objset.h     |   43 +
 include/zfs/dnode.h          |   80 ++
 include/zfs/dsl_dataset.h    |   52 +
 include/zfs/dsl_dir.h        |   48 +
 include/zfs/sa_impl.h        |   34 +
 include/zfs/spa.h            |  311 ++++++
 include/zfs/uberblock_impl.h |   57 +
 include/zfs/vdev_impl.h      |   69 ++
 include/zfs/zap_impl.h       |  112 ++
 include/zfs/zap_leaf.h       |  103 ++
 include/zfs/zfs.h            |  122 +++
 include/zfs/zfs_acl.h        |   55 +
 include/zfs/zfs_znode.h      |   70 ++
 include/zfs/zil.h            |   56 +
 include/zfs/zio.h            |   92 ++
 include/zfs/zio_checksum.h   |   49 +
 include/zfs_common.h         |   94 ++
 29 files changed, 4687 insertions(+), 17 deletions(-)
 create mode 100644 common/cmd_zfs.c
 copy fs/{ => zfs}/Makefile (56%)
 create mode 100644 fs/zfs/dev.c
 create mode 100644 fs/zfs/zfs.c
 create mode 100644 fs/zfs/zfs_fletcher.c
 create mode 100644 fs/zfs/zfs_lzjb.c
 create mode 100644 fs/zfs/zfs_sha256.c
 create mode 100644 include/zfs/dmu.h
 create mode 100644 include/zfs/dmu_objset.h
 create mode 100644 include/zfs/dnode.h
 create mode 100644 include/zfs/dsl_dataset.h
 create mode 100644 include/zfs/dsl_dir.h
 create mode 100644 include/zfs/sa_impl.h
 create mode 100644 include/zfs/spa.h
 create mode 100644 include/zfs/uberblock_impl.h
 create mode 100644 include/zfs/vdev_impl.h
 create mode 100644 include/zfs/zap_impl.h
 create mode 100644 include/zfs/zap_leaf.h
 create mode 100644 include/zfs/zfs.h
 create mode 100644 include/zfs/zfs_acl.h
 create mode 100644 include/zfs/zfs_znode.h
 create mode 100644 include/zfs/zil.h
 create mode 100644 include/zfs/zio.h
 create mode 100644 include/zfs/zio_checksum.h
 create mode 100644 include/zfs_common.h

This U-Boot port is based on sources that Sun forked from GRUB-0.97 in 2004, which can be found here: http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/grub/grub-0.97...
Sun released them for GRUB under the following license:
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
Official GRUB releases include ZFS support, for example in: ftp://alpha.gnu.org/gnu/grub/grub-1.99~rc1.tar.gz
The port also carries the ashift fixes (for 4KB-sector HDDs) from the GRUB Bazaar repository, which are more conveniently found on GitHub: https://github.com/pendor/grub-zfs/commit/e7b6ef3ac3b9685ac4c394c897b1d4221b...
Signed-off-by: Jorgen Lundman lundman@lundman.net
---
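On the ashift fixes mentioned above: ashift is the base-2 logarithm of a vdev's sector size, so a 4KB-sector disk has ashift 12 rather than 9. To my understanding of the on-disk format, this also changes the size and number of uberblock slots in each label. The sketch below illustrates the arithmetic only; the two constants are assumptions based on the upstream ZFS headers, and all function names are hypothetical rather than taken from this patch.

```c
#include <stdint.h>

/* Assumed values, based on the upstream ZFS headers. */
#define UBERBLOCK_SHIFT      10          /* minimum slot size: 1KB */
#define UBERBLOCK_RING_BYTES (128 << 10) /* uberblock ring per label: 128KB */

/* Sector size implied by a vdev's ashift (9 -> 512, 12 -> 4096). */
uint64_t ashift_sector_size(uint64_t ashift)
{
	return (uint64_t)1 << ashift;
}

/* Size of one uberblock slot: never smaller than one sector. */
uint64_t uberblock_slot_size(uint64_t ashift)
{
	uint64_t shift = ashift > UBERBLOCK_SHIFT ? ashift : UBERBLOCK_SHIFT;

	return (uint64_t)1 << shift;
}

/* Number of uberblock slots in one label's ring. */
uint64_t uberblock_slot_count(uint64_t ashift)
{
	return UBERBLOCK_RING_BYTES / uberblock_slot_size(ashift);
}
```

With ashift 9 this gives 128 slots of 1KB each; with ashift 12 it gives 32 slots of 4KB each, so code that walks the ring in fixed 1KB strides would likely misread a 4KB-sector pool.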
v3:
 * add missing patch revision history (this text)
 * Submitted as single patch per Wolfgang Denk instructions
v2:
 * Keep Makefile placement alphabetically sorted.
 * Clean ugly line breaks and indentation errors
 * Fix license corruption in fs/Makefile
---
 Makefile                     |    2 +-
 common/Makefile              |    1 +
 common/cmd_zfs.c             |  236 +++++
 fs/Makefile                  |    3 +-
 fs/{ => zfs}/Makefile        |   39 +-
 fs/zfs/dev.c                 |  137 +++
 fs/zfs/zfs.c                 | 2396 ++++++++++++++++++++++++++++++++++++++++++
 fs/zfs/zfs_fletcher.c        |   84 ++
 fs/zfs/zfs_lzjb.c            |   94 ++
 fs/zfs/zfs_sha256.c          |  145 +++
 include/config_cmd_all.h     |    1 +
 include/zfs/dmu.h            |  119 +++
 include/zfs/dmu_objset.h     |   43 +
 include/zfs/dnode.h          |   80 ++
 include/zfs/dsl_dataset.h    |   52 +
 include/zfs/dsl_dir.h        |   48 +
 include/zfs/sa_impl.h        |   34 +
 include/zfs/spa.h            |  311 ++++++
 include/zfs/uberblock_impl.h |   57 +
 include/zfs/vdev_impl.h      |   69 ++
 include/zfs/zap_impl.h       |  112 ++
 include/zfs/zap_leaf.h       |  103 ++
 include/zfs/zfs.h            |  122 +++
 include/zfs/zfs_acl.h        |   55 +
 include/zfs/zfs_znode.h      |   70 ++
 include/zfs/zil.h            |   56 +
 include/zfs/zio.h            |   92 ++
 include/zfs/zio_checksum.h   |   49 +
 include/zfs_common.h         |   94 ++
 29 files changed, 4687 insertions(+), 17 deletions(-)
 create mode 100644 common/cmd_zfs.c
 copy fs/{ => zfs}/Makefile (56%)
 create mode 100644 fs/zfs/dev.c
 create mode 100644 fs/zfs/zfs.c
 create mode 100644 fs/zfs/zfs_fletcher.c
 create mode 100644 fs/zfs/zfs_lzjb.c
 create mode 100644 fs/zfs/zfs_sha256.c
 create mode 100644 include/zfs/dmu.h
 create mode 100644 include/zfs/dmu_objset.h
 create mode 100644 include/zfs/dnode.h
 create mode 100644 include/zfs/dsl_dataset.h
 create mode 100644 include/zfs/dsl_dir.h
 create mode 100644 include/zfs/sa_impl.h
 create mode 100644 include/zfs/spa.h
 create mode 100644 include/zfs/uberblock_impl.h
 create mode 100644 include/zfs/vdev_impl.h
 create mode 100644 include/zfs/zap_impl.h
 create mode 100644 include/zfs/zap_leaf.h
 create mode 100644 include/zfs/zfs.h
 create mode 100644 include/zfs/zfs_acl.h
 create mode 100644 include/zfs/zfs_znode.h
 create mode 100644 include/zfs/zil.h
 create mode 100644 include/zfs/zio.h
 create mode 100644 include/zfs/zio_checksum.h
 create mode 100644 include/zfs_common.h
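zfs_devread() in fs/zfs/dev.c (below) serves an arbitrary byte range by reading an unaligned head through a bounce buffer, then whole sectors directly into the caller's buffer, then a partial tail through the bounce buffer again. A minimal sketch of just the splitting arithmetic, assuming the patch's 512-byte SECTOR_SIZE; struct read_plan and plan_read() are hypothetical names, not functions from the patch:

```c
#include <stdint.h>

#define SECTOR_SIZE 0x200
#define SECTOR_BITS 9

/* Hypothetical description of how one request is split up. */
struct read_plan {
	uint32_t first_sector;	/* sector holding the first requested byte */
	uint32_t head_bytes;	/* bytes taken from a partial first sector */
	uint32_t full_sectors;	/* whole sectors read straight into the buffer */
	uint32_t tail_bytes;	/* bytes taken from a partial last sector */
};

struct read_plan plan_read(uint32_t sector, uint32_t byte_offset,
			   uint32_t byte_len)
{
	struct read_plan p;
	uint32_t in_sector;

	/* Normalise so that the offset lies inside the first sector. */
	sector += byte_offset >> SECTOR_BITS;
	in_sector = byte_offset & (SECTOR_SIZE - 1);

	p.first_sector = sector;
	p.head_bytes = 0;
	if (in_sector != 0) {
		/* Unaligned start: take the rest of this sector at most. */
		p.head_bytes = SECTOR_SIZE - in_sector;
		if (p.head_bytes > byte_len)
			p.head_bytes = byte_len;
	}
	byte_len -= p.head_bytes;

	p.full_sectors = byte_len >> SECTOR_BITS;
	p.tail_bytes = byte_len & (SECTOR_SIZE - 1);
	return p;
}
```

For example, a request for 1000 bytes at byte offset 700 of sector 10 starts in sector 11 and splits into 324 head bytes, one full sector and 164 tail bytes.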
diff --git a/Makefile b/Makefile index 351a8f0..d3b84bf 100644 --- a/Makefile +++ b/Makefile @@ -244,7 +244,7 @@ endif LIBS += arch/$(ARCH)/lib/lib$(ARCH).o LIBS += fs/cramfs/libcramfs.o fs/fat/libfat.o fs/fdos/libfdos.o fs/jffs2/libjffs2.o \ fs/reiserfs/libreiserfs.o fs/ext2/libext2fs.o fs/yaffs2/libyaffs2.o \ - fs/ubifs/libubifs.o + fs/ubifs/libubifs.o fs/zfs/libzfs.o LIBS += net/libnet.o LIBS += disk/libdisk.o LIBS += drivers/bios_emulator/libatibiosemu.o diff --git a/common/Makefile b/common/Makefile index 6e23baa..4de03da 100644 --- a/common/Makefile +++ b/common/Makefile @@ -164,6 +164,7 @@ COBJS-$(CONFIG_USB_STORAGE) += usb_storage.o endif COBJS-$(CONFIG_CMD_XIMG) += cmd_ximg.o COBJS-$(CONFIG_YAFFS2) += cmd_yaffs2.o +COBJS-$(CONFIG_CMD_ZFS) += cmd_zfs.o COBJS-$(CONFIG_CMD_SPL) += cmd_spl.o
# others diff --git a/common/cmd_zfs.c b/common/cmd_zfs.c new file mode 100644 index 0000000..a6ea2c0 --- /dev/null +++ b/common/cmd_zfs.c @@ -0,0 +1,236 @@ +/* + * + * ZFS filesystem porting to Uboot by + * Jorgen Lundman <lundman at lundman.net> + * + * zfsfs support + * made from existing GRUB Sources by Sun, GNU and others. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License as + * published by the Free Software Foundation; either version 2 of + * the License, or (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, + * MA 02111-1307 USA + * + */ + +#include <common.h> +#include <part.h> +#include <config.h> +#include <command.h> +#include <image.h> +#include <linux/ctype.h> +#include <asm/byteorder.h> +#include <zfs_common.h> +#include <linux/stat.h> +#include <malloc.h> + +#if defined(CONFIG_CMD_USB) && defined(CONFIG_USB_STORAGE) +#include <usb.h> +#endif + +#if !defined(CONFIG_DOS_PARTITION) && !defined(CONFIG_EFI_PARTITION) +#error DOS or EFI partition support must be selected +#endif + +#define DOS_PART_MAGIC_OFFSET 0x1fe +#define DOS_FS_TYPE_OFFSET 0x36 +#define DOS_FS32_TYPE_OFFSET 0x52 + +static int do_zfs_load(cmd_tbl_t *cmdtp, int flag, int argc, char *argv[]) +{ + char *filename = NULL; + char *ep; + int dev; + unsigned long part = 1; + ulong addr = 0; + ulong part_length; + disk_partition_t info; + char buf[12]; + unsigned long count; + const char *addr_str; + struct zfs_file zfile; + struct device_s vdev; + + if (argc < 3) + return 
CMD_RET_USAGE; + + count = 0; + /* argv[3] is only present when an address was given */ + if (argc > 3) + addr = simple_strtoul(argv[3], NULL, 16); + filename = getenv("bootfile"); + switch (argc) { + case 3: + addr_str = getenv("loadaddr"); + if (addr_str != NULL) + addr = simple_strtoul(addr_str, NULL, 16); + else + addr = CONFIG_SYS_LOAD_ADDR; + + break; + case 4: + break; + case 5: + filename = argv[4]; + break; + case 6: + filename = argv[4]; + count = simple_strtoul(argv[5], NULL, 16); + break; + + default: + return cmd_usage(cmdtp); + } + + if (!filename) { + puts("** No boot file defined **\n"); + return 1; + } + + dev = (int)simple_strtoul(argv[2], &ep, 16); + zfs_dev_desc = get_dev(argv[1], dev); + if (zfs_dev_desc == NULL) { + printf("** Block device %s %d not supported\n", argv[1], dev); + return 1; + } + + if (*ep) { + if (*ep != ':') { + puts("** Invalid boot device, use `dev[:part]' **\n"); + return 1; + } + part = simple_strtoul(++ep, NULL, 16); + } + + if (part != 0) { + if (get_partition_info(zfs_dev_desc, part, &info)) { + printf("** Bad partition %lu **\n", part); + return 1; + } + + if (strncmp((char *)info.type, BOOT_PART_TYPE, + strlen(BOOT_PART_TYPE)) != 0) { + printf("** Invalid partition type \"%s\" (expect \"" BOOT_PART_TYPE "\")\n", + info.type); + return 1; + } + printf("Loading file \"%s\" " + "from %s device %d:%lu %s\n", + filename, argv[1], dev, part, info.name); + } else { + printf("Loading file \"%s\" from %s device %d\n", + filename, argv[1], dev); + } + + part_length = zfs_set_blk_dev(zfs_dev_desc, part); + if (part_length == 0) { + printf("** Bad partition - %s %d:%lu **\n", argv[1], dev, part); + return 1; + } + + vdev.part_length = part_length; + + memset(&zfile, 0, sizeof(zfile)); + zfile.device = &vdev; + if (zfs_open(&zfile, filename)) { + printf("** File not found %s\n", filename); + return 1; + } + + if ((count < zfile.size) && (count != 0)) + zfile.size = (uint64_t)count; + + if (zfs_read(&zfile, (char *)addr, zfile.size) != zfile.size) { + printf("** Unable to read \"%s\" from %s %d:%lu **\n", + 
filename, argv[1], dev, part); + zfs_close(&zfile); + return 1; + } + + zfs_close(&zfile); + + /* Loading ok, update default load address */ + load_addr = addr; + + printf("%llu bytes read\n", zfile.size); + sprintf(buf, "%llX", zfile.size); + setenv("filesize", buf); + + return 0; +} + + +int zfs_print(const char *entry, const struct zfs_dirhook_info *data) +{ + printf("%s %s\n", + data->dir ? "<DIR> " : " ", + entry); + return 0; /* 0 continue, 1 stop */ +} + + + +static int do_zfs_ls(cmd_tbl_t *cmdtp, int flag, int argc, char *argv[]) +{ + const char *filename = "/"; + int dev; + unsigned long part = 1; + char *ep; + int part_length; + struct device_s vdev; + + if (argc < 3) + return cmd_usage(cmdtp); + + dev = (int)simple_strtoul(argv[2], &ep, 16); + zfs_dev_desc = get_dev(argv[1], dev); + + if (zfs_dev_desc == NULL) { + printf("\n** Block device %s %d not supported\n", argv[1], dev); + return 1; + } + + if (*ep) { + if (*ep != ':') { + puts("\n** Invalid boot device, use `dev[:part]' **\n"); + return 1; + } + part = simple_strtoul(++ep, NULL, 16); + } + + if (argc == 4) + filename = argv[3]; + + part_length = zfs_set_blk_dev(zfs_dev_desc, part); + if (part_length == 0) { + printf("** Bad partition - %s %d:%lu **\n", argv[1], dev, part); + return 1; + } + + vdev.part_length = part_length; + + zfs_ls(&vdev, filename, + zfs_print); + + return 0; +} + + +U_BOOT_CMD(zfsls, 4, 1, do_zfs_ls, + "list files in a directory (default /)", + "<interface> <dev[:part]> [directory]\n" + " - list files from 'dev' on 'interface' in a '/DATASET/@/$dir/'"); + +U_BOOT_CMD(zfsload, 6, 0, do_zfs_load, + "load binary file from a ZFS filesystem", + "<interface> <dev[:part]> [addr] [filename] [bytes]\n" + " - load binary file '/DATASET/@/$dir/$file' from 'dev' on 'interface'\n" + " to address 'addr' from ZFS filesystem"); diff --git a/fs/Makefile b/fs/Makefile index 22aad12..baa7e96 100644 --- a/fs/Makefile +++ b/fs/Makefile @@ -1,6 +1,6 @@ # # (C) Copyright 2000-2006 -# Wolfgang Denk, 
DENX Software Engineering, wd@denx.de. +# Wolfgang Denk, DENX Software Engineering, <wd at denx.de> # # See file CREDITS for list of people who contributed to this # project. @@ -30,6 +30,7 @@ subdirs-$(CONFIG_CMD_JFFS2) += jffs2 subdirs-$(CONFIG_CMD_REISER) += reiserfs subdirs-$(CONFIG_YAFFS2) += yaffs2 subdirs-$(CONFIG_CMD_UBIFS) += ubifs +subdirs-$(CONFIG_CMD_ZFS) += zfs
SUBDIRS := $(subdirs-y)
diff --git a/fs/Makefile b/fs/zfs/Makefile similarity index 56% copy from fs/Makefile copy to fs/zfs/Makefile index 22aad12..938fc5e 100644 --- a/fs/Makefile +++ b/fs/zfs/Makefile @@ -1,6 +1,6 @@ # -# (C) Copyright 2000-2006 -# Wolfgang Denk, DENX Software Engineering, wd@denx.de. +# (C) Copyright 2012 +# Jorgen Lundman <lundman at lundman.net> # # See file CREDITS for list of people who contributed to this # project. @@ -20,19 +20,28 @@ # Foundation, Inc., 59 Temple Place, Suite 330, Boston, # MA 02111-1307 USA # -#
-subdirs-$(CONFIG_CMD_CRAMFS) := cramfs -subdirs-$(CONFIG_CMD_EXT2) += ext2 -subdirs-$(CONFIG_CMD_FAT) += fat -subdirs-$(CONFIG_CMD_FDOS) += fdos -subdirs-$(CONFIG_CMD_JFFS2) += jffs2 -subdirs-$(CONFIG_CMD_REISER) += reiserfs -subdirs-$(CONFIG_YAFFS2) += yaffs2 -subdirs-$(CONFIG_CMD_UBIFS) += ubifs +include $(TOPDIR)/config.mk + +LIB = $(obj)libzfs.o + +AOBJS = +COBJS-$(CONFIG_CMD_ZFS) := dev.o zfs.o zfs_fletcher.o zfs_sha256.o zfs_lzjb.o + +SRCS := $(AOBJS:.o=.S) $(COBJS-y:.o=.c) +OBJS := $(addprefix $(obj),$(AOBJS) $(COBJS-y)) + + +all: $(LIB) $(AOBJS) + +$(LIB): $(obj).depend $(OBJS) + $(call cmd_link_o_target, $(OBJS)) + +######################################################################### + +# defines $(obj).depend target +include $(SRCTREE)/rules.mk
-SUBDIRS := $(subdirs-y) +sinclude $(obj).depend
-$(obj).depend all: - @for dir in $(SUBDIRS) ; do \ - $(MAKE) -C $$dir $@ ; done +######################################################################### diff --git a/fs/zfs/dev.c b/fs/zfs/dev.c new file mode 100644 index 0000000..ab32865 --- /dev/null +++ b/fs/zfs/dev.c @@ -0,0 +1,137 @@ +/* + * + * based on code of fs/reiserfs/dev.c by + * + * (C) Copyright 2003 - 2004 + * Sysgo AG, <www.elinos.com>, Pavel Bartusek pba@sysgo.com + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. 
+ */ + + +#include <common.h> +#include <config.h> +#include <zfs_common.h> + +static block_dev_desc_t *zfs_block_dev_desc; +static disk_partition_t part_info; + +int zfs_set_blk_dev(block_dev_desc_t *rbdd, int part) +{ + zfs_block_dev_desc = rbdd; + + if (part == 0) { + /* disk doesn't use partition table */ + part_info.start = 0; + part_info.size = rbdd->lba; + part_info.blksz = rbdd->blksz; + } else { + if (get_partition_info(zfs_block_dev_desc, part, &part_info)) + return 0; + } + + return part_info.size; +} + +/* err */ +int zfs_devread(int sector, int byte_offset, int byte_len, char *buf) +{ + short sec_buffer[SECTOR_SIZE/sizeof(short)]; + char *sec_buf = sec_buffer; + unsigned block_len; + + /* + * Check partition boundaries + */ + if ((sector < 0) || + ((sector + ((byte_offset + byte_len - 1) >> SECTOR_BITS)) >= + part_info.size)) { + /* errnum = ERR_OUTSIDE_PART; */ + printf(" ** zfs_devread() read outside partition sector %d\n", sector); + return 1; + } + + /* + * Get the read to the beginning of a partition. 
+ */ + sector += byte_offset >> SECTOR_BITS; + byte_offset &= SECTOR_SIZE - 1; + + debug(" <%d, %d, %d>\n", sector, byte_offset, byte_len); + + if (zfs_block_dev_desc == NULL) { + printf("** Invalid Block Device Descriptor (NULL)\n"); + return 1; + } + + if (byte_offset != 0) { + /* read first part which isn't aligned with start of sector */ + if (zfs_block_dev_desc->block_read(zfs_block_dev_desc->dev, + part_info.start + sector, 1, + (unsigned long *) sec_buf) != 1) { + printf(" ** zfs_devread() read error **\n"); + return 1; + } + memcpy(buf, sec_buf + byte_offset, + min(SECTOR_SIZE - byte_offset, byte_len)); + buf += min(SECTOR_SIZE - byte_offset, byte_len); + byte_len -= min(SECTOR_SIZE - byte_offset, byte_len); + sector++; + } + + if (byte_len == 0) + return 0; + + /* read sector aligned part */ + block_len = byte_len & ~(SECTOR_SIZE - 1); + + if (block_len == 0) { + u8 p[SECTOR_SIZE]; + + block_len = SECTOR_SIZE; + zfs_block_dev_desc->block_read(zfs_block_dev_desc->dev, + part_info.start + sector, + 1, (unsigned long *)p); + memcpy(buf, p, byte_len); + return 0; + } + + if (zfs_block_dev_desc->block_read(zfs_block_dev_desc->dev, + part_info.start + sector, + block_len / SECTOR_SIZE, + (unsigned long *) buf) != + block_len / SECTOR_SIZE) { + printf(" ** zfs_devread() read error - block\n"); + return 1; + } + + block_len = byte_len & ~(SECTOR_SIZE - 1); + buf += block_len; + byte_len -= block_len; + sector += block_len / SECTOR_SIZE; + + if (byte_len != 0) { + /* read rest of data which are not in whole sector */ + if (zfs_block_dev_desc-> + block_read(zfs_block_dev_desc->dev, + part_info.start + sector, 1, + (unsigned long *) sec_buf) != 1) { + printf(" ** zfs_devread() read error - last part\n"); + return 1; + } + memcpy(buf, sec_buf, byte_len); + } + return 0; +} diff --git a/fs/zfs/zfs.c b/fs/zfs/zfs.c new file mode 100644 index 0000000..d6e0e23 --- /dev/null +++ b/fs/zfs/zfs.c @@ -0,0 +1,2396 @@ +/* + * + * ZFS filesystem ported to u-boot by + * Jorgen 
Lundman <lundman at lundman.net> + * + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 + * Free Software Foundation, Inc. + * Copyright 2004 Sun Microsystems, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + * + */ + +#include <common.h> +#include <malloc.h> +#include <linux/stat.h> +#include <linux/time.h> +#include <linux/ctype.h> +#include <asm/byteorder.h> +#include "zfs_common.h" + +block_dev_desc_t *zfs_dev_desc; + +/* + * The zfs plug-in routines for GRUB are: + * + * zfs_mount() - locates a valid uberblock of the root pool and reads + * in its MOS at the memory address MOS. + * + * zfs_open() - locates a plain file object by following the MOS + * and places its dnode at the memory address DNODE. + * + * zfs_read() - read in the data blocks pointed by the DNODE. + * + */ + +#include <zfs/zfs.h> +#include <zfs/zio.h> +#include <zfs/dnode.h> +#include <zfs/uberblock_impl.h> +#include <zfs/vdev_impl.h> +#include <zfs/zio_checksum.h> +#include <zfs/zap_impl.h> +#include <zfs/zap_leaf.h> +#include <zfs/zfs_znode.h> +#include <zfs/dmu.h> +#include <zfs/dmu_objset.h> +#include <zfs/sa_impl.h> +#include <zfs/dsl_dir.h> +#include <zfs/dsl_dataset.h> + + +#define ZPOOL_PROP_BOOTFS "bootfs" + + +/* + * For nvlist manipulation. 
(from nvpair.h) + */ +#define NV_ENCODE_NATIVE 0 +#define NV_ENCODE_XDR 1 +#define NV_BIG_ENDIAN 0 +#define NV_LITTLE_ENDIAN 1 +#define DATA_TYPE_UINT64 8 +#define DATA_TYPE_STRING 9 +#define DATA_TYPE_NVLIST 19 +#define DATA_TYPE_NVLIST_ARRAY 20 + + +/* + * Macros to get fields in a bp or DVA. + */ +#define P2PHASE(x, align) ((x) & ((align) - 1)) +#define DVA_OFFSET_TO_PHYS_SECTOR(offset) \ + ((offset + VDEV_LABEL_START_SIZE) >> SPA_MINBLOCKSHIFT) + +/* + * return x rounded down to an align boundary + * eg, P2ALIGN(1200, 1024) == 1024 (1*align) + * eg, P2ALIGN(1024, 1024) == 1024 (1*align) + * eg, P2ALIGN(0x1234, 0x100) == 0x1200 (0x12*align) + * eg, P2ALIGN(0x5600, 0x100) == 0x5600 (0x56*align) + */ +#define P2ALIGN(x, align) ((x) & -(align)) + +/* + * FAT ZAP data structures + */ +#define ZFS_CRC64_POLY 0xC96C5795D7870F42ULL /* ECMA-182, reflected form */ +#define ZAP_HASH_IDX(hash, n) (((n) == 0) ? 0 : ((hash) >> (64 - (n)))) +#define CHAIN_END 0xffff /* end of the chunk chain */ + +/* + * The amount of space within the chunk available for the array is: + * chunk size - space for type (1) - space for next pointer (2) + */ +#define ZAP_LEAF_ARRAY_BYTES (ZAP_LEAF_CHUNKSIZE - 3) + +#define ZAP_LEAF_HASH_SHIFT(bs) (bs - 5) +#define ZAP_LEAF_HASH_NUMENTRIES(bs) (1 << ZAP_LEAF_HASH_SHIFT(bs)) +#define LEAF_HASH(bs, h) \ + ((ZAP_LEAF_HASH_NUMENTRIES(bs)-1) & \ + ((h) >> (64 - ZAP_LEAF_HASH_SHIFT(bs)-l->l_hdr.lh_prefix_len))) + +/* + * The amount of space available for chunks is: + * block size shift - hash entry size (2) * number of hash + * entries - header space (2*chunksize) + */ +#define ZAP_LEAF_NUMCHUNKS(bs) \ + (((1<<bs) - 2*ZAP_LEAF_HASH_NUMENTRIES(bs)) / \ + ZAP_LEAF_CHUNKSIZE - 2) + +/* + * The chunks start immediately after the hash table. The end of the + * hash table is at l_hash + HASH_NUMENTRIES, which we simply cast to a + * chunk_t. 
+ */ +#define ZAP_LEAF_CHUNK(l, bs, idx) \ + ((zap_leaf_chunk_t *)(l->l_hash + ZAP_LEAF_HASH_NUMENTRIES(bs)))[idx] +#define ZAP_LEAF_ENTRY(l, bs, idx) (&ZAP_LEAF_CHUNK(l, bs, idx).l_entry) + + +/* + * Decompression Entry - lzjb + */ +#ifndef NBBY +#define NBBY 8 +#endif + + + +typedef int zfs_decomp_func_t(void *s_start, void *d_start, + uint32_t s_len, uint32_t d_len); +typedef struct decomp_entry { + char *name; + zfs_decomp_func_t *decomp_func; +} decomp_entry_t; + +typedef struct dnode_end { + dnode_phys_t dn; + grub_zfs_endian_t endian; +} dnode_end_t; + +struct grub_zfs_data { + /* cache for a file block of the currently zfs_open()-ed file */ + char *file_buf; + uint64_t file_start; + uint64_t file_end; + + /* XXX: ashift is per vdev, not per pool. We currently only ever touch + * a single vdev, but when/if raid-z or stripes are supported, this + * may need revision. + */ + uint64_t vdev_ashift; + uint64_t label_txg; + uint64_t pool_guid; + + /* cache for a dnode block */ + dnode_phys_t *dnode_buf; + dnode_phys_t *dnode_mdn; + uint64_t dnode_start; + uint64_t dnode_end; + grub_zfs_endian_t dnode_endian; + + uberblock_t current_uberblock; + + dnode_end_t mos; + dnode_end_t mdn; + dnode_end_t dnode; + + uint64_t vdev_phys_sector; + + int (*userhook)(const char *, const struct zfs_dirhook_info *); + struct zfs_dirhook_info *dirinfo; + +}; + + + + +static int +zlib_decompress(void *s, void *d, + uint32_t slen, uint32_t dlen) +{ + if (zlib_decompress(s, d, slen, dlen) < 0) + return ZFS_ERR_BAD_FS; + return ZFS_ERR_NONE; +} + +static decomp_entry_t decomp_table[ZIO_COMPRESS_FUNCTIONS] = { + {"inherit", NULL}, /* ZIO_COMPRESS_INHERIT */ + {"on", lzjb_decompress}, /* ZIO_COMPRESS_ON */ + {"off", NULL}, /* ZIO_COMPRESS_OFF */ + {"lzjb", lzjb_decompress}, /* ZIO_COMPRESS_LZJB */ + {"empty", NULL}, /* ZIO_COMPRESS_EMPTY */ + {"gzip-1", zlib_decompress}, /* ZIO_COMPRESS_GZIP1 */ + {"gzip-2", zlib_decompress}, /* ZIO_COMPRESS_GZIP2 */ + {"gzip-3", zlib_decompress}, /* 
ZIO_COMPRESS_GZIP3 */ + {"gzip-4", zlib_decompress}, /* ZIO_COMPRESS_GZIP4 */ + {"gzip-5", zlib_decompress}, /* ZIO_COMPRESS_GZIP5 */ + {"gzip-6", zlib_decompress}, /* ZIO_COMPRESS_GZIP6 */ + {"gzip-7", zlib_decompress}, /* ZIO_COMPRESS_GZIP7 */ + {"gzip-8", zlib_decompress}, /* ZIO_COMPRESS_GZIP8 */ + {"gzip-9", zlib_decompress}, /* ZIO_COMPRESS_GZIP9 */ +}; + + + +static int zio_read_data(blkptr_t *bp, grub_zfs_endian_t endian, + void *buf, struct grub_zfs_data *data); + +static int +zio_read(blkptr_t *bp, grub_zfs_endian_t endian, void **buf, + size_t *size, struct grub_zfs_data *data); + +/* + * Our own version of log2(). Same thing as highbit()-1. + */ +static int +zfs_log2(uint64_t num) +{ + int i = 0; + + while (num > 1) { + i++; + num = num >> 1; + } + + return i; +} + + +/* Checksum Functions */ +static void +zio_checksum_off(const void *buf __attribute__ ((unused)), + uint64_t size __attribute__ ((unused)), + grub_zfs_endian_t endian __attribute__ ((unused)), + zio_cksum_t *zcp) +{ + ZIO_SET_CHECKSUM(zcp, 0, 0, 0, 0); +} + +/* Checksum Table and Values */ +static zio_checksum_info_t zio_checksum_table[ZIO_CHECKSUM_FUNCTIONS] = { + {NULL, 0, 0, "inherit"}, + {NULL, 0, 0, "on"}, + {zio_checksum_off, 0, 0, "off"}, + {zio_checksum_SHA256, 1, 1, "label"}, + {zio_checksum_SHA256, 1, 1, "gang_header"}, + {NULL, 0, 0, "zilog"}, + {fletcher_2, 0, 0, "fletcher2"}, + {fletcher_4, 1, 0, "fletcher4"}, + {zio_checksum_SHA256, 1, 0, "SHA256"}, + {NULL, 0, 0, "zilog2"}, +}; + +/* + * zio_checksum_verify: Provides support for checksum verification. + * + * Fletcher2, Fletcher4, and SHA256 are supported. 
+ * + */ +static int +zio_checksum_verify(zio_cksum_t zc, uint32_t checksum, + grub_zfs_endian_t endian, char *buf, int size) +{ + zio_eck_t *zec = (zio_eck_t *) (buf + size) - 1; + zio_checksum_info_t *ci = &zio_checksum_table[checksum]; + zio_cksum_t actual_cksum, expected_cksum; + + if (checksum >= ZIO_CHECKSUM_FUNCTIONS || ci->ci_func == NULL) { + printf("zfs unknown checksum function %d\n", checksum); + return ZFS_ERR_NOT_IMPLEMENTED_YET; + } + + if (ci->ci_eck) { + expected_cksum = zec->zec_cksum; + zec->zec_cksum = zc; + ci->ci_func(buf, size, endian, &actual_cksum); + zec->zec_cksum = expected_cksum; + zc = expected_cksum; + } else { + ci->ci_func(buf, size, endian, &actual_cksum); + } + + if ((actual_cksum.zc_word[0] != zc.zc_word[0]) + || (actual_cksum.zc_word[1] != zc.zc_word[1]) + || (actual_cksum.zc_word[2] != zc.zc_word[2]) + || (actual_cksum.zc_word[3] != zc.zc_word[3])) { + return ZFS_ERR_BAD_FS; + } + + return ZFS_ERR_NONE; +} + +/* + * vdev_uberblock_compare takes two uberblock structures and returns an integer + * indicating the more recent of the two. + * Return Value = 1 if ub1 is more recent + * Return Value = -1 if ub2 is more recent + * Return Value = 0 if neither is more recent + * The most recent uberblock is determined using its transaction number and + * timestamp. The uberblock with the highest transaction number is + * considered "newer". If the transaction numbers of the two blocks match, the + * timestamps are compared to determine the "newer" of the two. 
+ */ +static int +vdev_uberblock_compare(uberblock_t *ub1, uberblock_t *ub2) +{ + grub_zfs_endian_t ub1_endian, ub2_endian; + if (grub_zfs_to_cpu64(ub1->ub_magic, LITTLE_ENDIAN) == UBERBLOCK_MAGIC) + ub1_endian = LITTLE_ENDIAN; + else + ub1_endian = BIG_ENDIAN; + if (grub_zfs_to_cpu64(ub2->ub_magic, LITTLE_ENDIAN) == UBERBLOCK_MAGIC) + ub2_endian = LITTLE_ENDIAN; + else + ub2_endian = BIG_ENDIAN; + + if (grub_zfs_to_cpu64(ub1->ub_txg, ub1_endian) + < grub_zfs_to_cpu64(ub2->ub_txg, ub2_endian)) + return -1; + if (grub_zfs_to_cpu64(ub1->ub_txg, ub1_endian) + > grub_zfs_to_cpu64(ub2->ub_txg, ub2_endian)) + return 1; + + if (grub_zfs_to_cpu64(ub1->ub_timestamp, ub1_endian) + < grub_zfs_to_cpu64(ub2->ub_timestamp, ub2_endian)) + return -1; + if (grub_zfs_to_cpu64(ub1->ub_timestamp, ub1_endian) + > grub_zfs_to_cpu64(ub2->ub_timestamp, ub2_endian)) + return 1; + + return 0; +} + +/* + * Three pieces of information are needed to verify an uberblock: the magic + * number, the version number, and the checksum. 
+ * + * Currently Implemented: version number, magic number, label txg + * Need to Implement: checksum + * + */ +static int +uberblock_verify(uberblock_t *uber, int offset, struct grub_zfs_data *data) +{ + int err; + grub_zfs_endian_t endian = UNKNOWN_ENDIAN; + zio_cksum_t zc; + + if (uber->ub_txg < data->label_txg) { + debug("ignoring partially written label: uber_txg < label_txg %llu %llu\n", + uber->ub_txg, data->label_txg); + return ZFS_ERR_BAD_FS; + } + + if (grub_zfs_to_cpu64(uber->ub_magic, LITTLE_ENDIAN) == UBERBLOCK_MAGIC + && grub_zfs_to_cpu64(uber->ub_version, LITTLE_ENDIAN) > 0 + && grub_zfs_to_cpu64(uber->ub_version, LITTLE_ENDIAN) <= SPA_VERSION) + endian = LITTLE_ENDIAN; + + if (grub_zfs_to_cpu64(uber->ub_magic, BIG_ENDIAN) == UBERBLOCK_MAGIC + && grub_zfs_to_cpu64(uber->ub_version, BIG_ENDIAN) > 0 + && grub_zfs_to_cpu64(uber->ub_version, BIG_ENDIAN) <= SPA_VERSION) + endian = BIG_ENDIAN; + + if (endian == UNKNOWN_ENDIAN) { + printf("invalid uberblock magic\n"); + return ZFS_ERR_BAD_FS; + } + + memset(&zc, 0, sizeof(zc)); + zc.zc_word[0] = grub_cpu_to_zfs64(offset, endian); + err = zio_checksum_verify(zc, ZIO_CHECKSUM_LABEL, endian, + (char *) uber, UBERBLOCK_SIZE(data->vdev_ashift)); + + if (!err) { + /* Check that the data pointed by the rootbp is usable. */ + void *osp = NULL; + size_t ospsize; + err = zio_read(&uber->ub_rootbp, endian, &osp, &ospsize, data); + free(osp); + + if (!err && ospsize < OBJSET_PHYS_SIZE_V14) { + printf("uberblock rootbp points to invalid data\n"); + return ZFS_ERR_BAD_FS; + } + } + + return err; +} + +/* + * Find the best uberblock. + * Return: + * Success - Pointer to the best uberblock. 
+ * Failure - NULL + */ +static uberblock_t *find_bestub(char *ub_array, struct grub_zfs_data *data) +{ + const uint64_t sector = data->vdev_phys_sector; + uberblock_t *ubbest = NULL; + uberblock_t *ubnext; + unsigned int i, offset, pickedub = 0; + int err = ZFS_ERR_NONE; + + const unsigned int UBCOUNT = UBERBLOCK_COUNT(data->vdev_ashift); + const uint64_t UBBYTES = UBERBLOCK_SIZE(data->vdev_ashift); + + for (i = 0; i < UBCOUNT; i++) { + ubnext = (uberblock_t *) (i * UBBYTES + ub_array); + offset = (sector << SPA_MINBLOCKSHIFT) + VDEV_PHYS_SIZE + (i * UBBYTES); + + err = uberblock_verify(ubnext, offset, data); + if (err) + continue; + + if (ubbest == NULL || vdev_uberblock_compare(ubnext, ubbest) > 0) { + ubbest = ubnext; + pickedub = i; + } + } + + if (ubbest) + debug("zfs Found best uberblock at idx %d, txg %llu\n", + pickedub, (unsigned long long) ubbest->ub_txg); + + return ubbest; +} + +static inline size_t +get_psize(blkptr_t *bp, grub_zfs_endian_t endian) +{ + return (((grub_zfs_to_cpu64((bp)->blk_prop, endian) >> 16) & 0xffff) + 1) + << SPA_MINBLOCKSHIFT; +} + +static uint64_t +dva_get_offset(dva_t *dva, grub_zfs_endian_t endian) +{ + return grub_zfs_to_cpu64((dva)->dva_word[1], + endian) << SPA_MINBLOCKSHIFT; +} + +/* + * Read a block of data based on the gang block address dva, + * and put its data in buf. 
+ * + */ +static int +zio_read_gang(blkptr_t *bp, grub_zfs_endian_t endian, dva_t *dva, void *buf, + struct grub_zfs_data *data) +{ + zio_gbh_phys_t *zio_gb; + uint64_t offset, sector; + unsigned i; + int err; + zio_cksum_t zc; + + memset(&zc, 0, sizeof(zc)); + + zio_gb = malloc(SPA_GANGBLOCKSIZE); + if (!zio_gb) + return ZFS_ERR_OUT_OF_MEMORY; + + offset = dva_get_offset(dva, endian); + sector = DVA_OFFSET_TO_PHYS_SECTOR(offset); + + /* read in the gang block header */ + err = zfs_devread(sector, 0, SPA_GANGBLOCKSIZE, (char *) zio_gb); + + if (err) { + free(zio_gb); + return err; + } + + /* XXX */ + /* self-checksumming the gang block header */ + ZIO_SET_CHECKSUM(&zc, DVA_GET_VDEV(dva), + dva_get_offset(dva, endian), bp->blk_birth, 0); + err = zio_checksum_verify(zc, ZIO_CHECKSUM_GANG_HEADER, endian, + (char *) zio_gb, SPA_GANGBLOCKSIZE); + if (err) { + free(zio_gb); + return err; + } + + endian = (grub_zfs_to_cpu64(bp->blk_prop, endian) >> 63) & 1; + + for (i = 0; i < SPA_GBH_NBLKPTRS; i++) { + if (zio_gb->zg_blkptr[i].blk_birth == 0) + continue; + + err = zio_read_data(&zio_gb->zg_blkptr[i], endian, buf, data); + if (err) { + free(zio_gb); + return err; + } + buf = (char *) buf + get_psize(&zio_gb->zg_blkptr[i], endian); + } + free(zio_gb); + return ZFS_ERR_NONE; +} + +/* + * Read in a block of raw data to buf. 
+ */ +static int +zio_read_data(blkptr_t *bp, grub_zfs_endian_t endian, void *buf, + struct grub_zfs_data *data) +{ + int i, psize; + int err = ZFS_ERR_NONE; + + psize = get_psize(bp, endian); + + /* pick a good dva from the block pointer */ + for (i = 0; i < SPA_DVAS_PER_BP; i++) { + uint64_t offset, sector; + + if (bp->blk_dva[i].dva_word[0] == 0 && bp->blk_dva[i].dva_word[1] == 0) + continue; + + if ((grub_zfs_to_cpu64(bp->blk_dva[i].dva_word[1], endian)>>63) & 1) { + err = zio_read_gang(bp, endian, &bp->blk_dva[i], buf, data); + } else { + /* read in a data block */ + offset = dva_get_offset(&bp->blk_dva[i], endian); + sector = DVA_OFFSET_TO_PHYS_SECTOR(offset); + + err = zfs_devread(sector, 0, psize, buf); + } + + if (!err) { + /*Check the underlying checksum before we rule this DVA as "good"*/ + uint32_t checkalgo = (grub_zfs_to_cpu64((bp)->blk_prop, endian) >> 40) & 0xff; + + err = zio_checksum_verify(bp->blk_cksum, checkalgo, endian, buf, psize); + if (!err) + return ZFS_ERR_NONE; + } + + /* If read failed or checksum bad, reset the error. Hopefully we've got some more DVA's to try.*/ + } + + if (!err) { + printf("couldn't find a valid DVA\n"); + err = ZFS_ERR_BAD_FS; + } + + return err; +} + +/* + * Read in a block of data, verify its checksum, decompress if needed, + * and put the uncompressed data in buf. + */ +static int +zio_read(blkptr_t *bp, grub_zfs_endian_t endian, void **buf, + size_t *size, struct grub_zfs_data *data) +{ + size_t lsize, psize; + unsigned int comp; + char *compbuf = NULL; + int err; + + *buf = NULL; + + comp = (grub_zfs_to_cpu64((bp)->blk_prop, endian)>>32) & 0xff; + lsize = (BP_IS_HOLE(bp) ? 
0 : + (((grub_zfs_to_cpu64((bp)->blk_prop, endian) & 0xffff) + 1) + << SPA_MINBLOCKSHIFT)); + psize = get_psize(bp, endian); + + if (size) + *size = lsize; + + if (comp >= ZIO_COMPRESS_FUNCTIONS) { + printf("compression algorithm %u not supported\n", (unsigned int) comp); + return ZFS_ERR_NOT_IMPLEMENTED_YET; + } + + if (comp != ZIO_COMPRESS_OFF && decomp_table[comp].decomp_func == NULL) { + printf("compression algorithm %s not supported\n", decomp_table[comp].name); + return ZFS_ERR_NOT_IMPLEMENTED_YET; + } + + if (comp != ZIO_COMPRESS_OFF) { + compbuf = malloc(psize); + if (!compbuf) + return ZFS_ERR_OUT_OF_MEMORY; + } else { + compbuf = *buf = malloc(lsize); + } + + err = zio_read_data(bp, endian, compbuf, data); + if (err) { + free(compbuf); + *buf = NULL; + return err; + } + + if (comp != ZIO_COMPRESS_OFF) { + *buf = malloc(lsize); + if (!*buf) { + free(compbuf); + return ZFS_ERR_OUT_OF_MEMORY; + } + + err = decomp_table[comp].decomp_func(compbuf, *buf, psize, lsize); + free(compbuf); + if (err) { + free(*buf); + *buf = NULL; + return err; + } + } + + return ZFS_ERR_NONE; +} + +/* + * Get the block from a block id. + * push the block onto the stack. 
+ * + */ +static int +dmu_read(dnode_end_t *dn, uint64_t blkid, void **buf, + grub_zfs_endian_t *endian_out, struct grub_zfs_data *data) +{ + int idx, level; + blkptr_t *bp_array = dn->dn.dn_blkptr; + int epbs = dn->dn.dn_indblkshift - SPA_BLKPTRSHIFT; + blkptr_t *bp; + void *tmpbuf = 0; + grub_zfs_endian_t endian; + int err = ZFS_ERR_NONE; + + bp = malloc(sizeof(blkptr_t)); + if (!bp) + return ZFS_ERR_OUT_OF_MEMORY; + + endian = dn->endian; + for (level = dn->dn.dn_nlevels - 1; level >= 0; level--) { + idx = (blkid >> (epbs * level)) & ((1 << epbs) - 1); + *bp = bp_array[idx]; + if (bp_array != dn->dn.dn_blkptr) { + free(bp_array); + bp_array = 0; + } + + if (BP_IS_HOLE(bp)) { + size_t size = grub_zfs_to_cpu16(dn->dn.dn_datablkszsec, + dn->endian) + << SPA_MINBLOCKSHIFT; + *buf = malloc(size); + if (!*buf) { + err = ZFS_ERR_OUT_OF_MEMORY; + break; + } + memset(*buf, 0, size); + endian = (grub_zfs_to_cpu64(bp->blk_prop, endian) >> 63) & 1; + break; + } + if (level == 0) { + err = zio_read(bp, endian, buf, 0, data); + endian = (grub_zfs_to_cpu64(bp->blk_prop, endian) >> 63) & 1; + break; + } + err = zio_read(bp, endian, &tmpbuf, 0, data); + endian = (grub_zfs_to_cpu64(bp->blk_prop, endian) >> 63) & 1; + if (err) + break; + bp_array = tmpbuf; + } + if (bp_array != dn->dn.dn_blkptr) + free(bp_array); + if (endian_out) + *endian_out = endian; + + free(bp); + return err; +} + +/* + * mzap_lookup: Looks up property described by "name" and returns the value + * in "value". 
+ */ +static int +mzap_lookup(mzap_phys_t *zapobj, grub_zfs_endian_t endian, + int objsize, char *name, uint64_t * value) +{ + int i, chunks; + mzap_ent_phys_t *mzap_ent = zapobj->mz_chunk; + + chunks = objsize / MZAP_ENT_LEN - 1; + for (i = 0; i < chunks; i++) { + if (strcmp(mzap_ent[i].mze_name, name) == 0) { + *value = grub_zfs_to_cpu64(mzap_ent[i].mze_value, endian); + return ZFS_ERR_NONE; + } + } + + printf("couldn't find '%s'\n", name); + return ZFS_ERR_FILE_NOT_FOUND; +} + +static int +mzap_iterate(mzap_phys_t *zapobj, grub_zfs_endian_t endian, int objsize, + int (*hook)(const char *name, + uint64_t val, + struct grub_zfs_data *data), + struct grub_zfs_data *data) +{ + int i, chunks; + mzap_ent_phys_t *mzap_ent = zapobj->mz_chunk; + + chunks = objsize / MZAP_ENT_LEN - 1; + for (i = 0; i < chunks; i++) { + if (hook(mzap_ent[i].mze_name, + grub_zfs_to_cpu64(mzap_ent[i].mze_value, endian), + data)) + return 1; + } + + return 0; +} + +static uint64_t +zap_hash(uint64_t salt, const char *name) +{ + static uint64_t table[256]; + const uint8_t *cp; + uint8_t c; + uint64_t crc = salt; + + if (table[128] == 0) { + uint64_t *ct; + int i, j; + for (i = 0; i < 256; i++) { + for (ct = table + i, *ct = i, j = 8; j > 0; j--) + *ct = (*ct >> 1) ^ (-(*ct & 1) & ZFS_CRC64_POLY); + } + } + + for (cp = (const uint8_t *) name; (c = *cp) != '\0'; cp++) + crc = (crc >> 8) ^ table[(crc ^ c) & 0xFF]; + + /* + * Only use 28 bits, since we need 4 bits in the cookie for the + * collision differentiator. We MUST use the high bits, since + * those are the ones that we first pay attention to when + * choosing the bucket. + */ + crc &= ~((1ULL << (64 - ZAP_HASHBITS)) - 1); + + return crc; +} + +/* + * Only to be used on 8-bit arrays. + * array_len is actual len in bytes (not encoded le_value_length). + * buf is null-terminated. 
+ */ +/* XXX */ +static int +zap_leaf_array_equal(zap_leaf_phys_t *l, grub_zfs_endian_t endian, + int blksft, int chunk, int array_len, const char *buf) +{ + int bseen = 0; + + while (bseen < array_len) { + struct zap_leaf_array *la = &ZAP_LEAF_CHUNK(l, blksft, chunk).l_array; + int toread = MIN(array_len - bseen, ZAP_LEAF_ARRAY_BYTES); + + if (chunk >= ZAP_LEAF_NUMCHUNKS(blksft)) + return 0; + + if (memcmp(la->la_array, buf + bseen, toread) != 0) + break; + chunk = grub_zfs_to_cpu16(la->la_next, endian); + bseen += toread; + } + return (bseen == array_len); +} + +/* XXX */ +static int +zap_leaf_array_get(zap_leaf_phys_t *l, grub_zfs_endian_t endian, int blksft, + int chunk, int array_len, char *buf) +{ + int bseen = 0; + + while (bseen < array_len) { + struct zap_leaf_array *la = &ZAP_LEAF_CHUNK(l, blksft, chunk).l_array; + int toread = MIN(array_len - bseen, ZAP_LEAF_ARRAY_BYTES); + + if (chunk >= ZAP_LEAF_NUMCHUNKS(blksft)) + /* Don't use errno because this error is to be ignored. */ + return ZFS_ERR_BAD_FS; + + memcpy(buf + bseen, la->la_array, toread); + chunk = grub_zfs_to_cpu16(la->la_next, endian); + bseen += toread; + } + return ZFS_ERR_NONE; +} + + +/* + * Given a zap_leaf_phys_t, walk thru the zap leaf chunks to get the + * value for the property "name". 
+ * + */ +/* XXX */ +static int +zap_leaf_lookup(zap_leaf_phys_t *l, grub_zfs_endian_t endian, + int blksft, uint64_t h, + const char *name, uint64_t *value) +{ + uint16_t chunk; + struct zap_leaf_entry *le; + + /* Verify if this is a valid leaf block */ + if (grub_zfs_to_cpu64(l->l_hdr.lh_block_type, endian) != ZBT_LEAF) { + printf("invalid leaf type\n"); + return ZFS_ERR_BAD_FS; + } + if (grub_zfs_to_cpu32(l->l_hdr.lh_magic, endian) != ZAP_LEAF_MAGIC) { + printf("invalid leaf magic\n"); + return ZFS_ERR_BAD_FS; + } + + for (chunk = grub_zfs_to_cpu16(l->l_hash[LEAF_HASH(blksft, h)], endian); + chunk != CHAIN_END; chunk = le->le_next) { + + if (chunk >= ZAP_LEAF_NUMCHUNKS(blksft)) { + printf("invalid chunk number\n"); + return ZFS_ERR_BAD_FS; + } + + le = ZAP_LEAF_ENTRY(l, blksft, chunk); + + /* Verify the chunk entry */ + if (le->le_type != ZAP_CHUNK_ENTRY) { + printf("invalid chunk entry\n"); + return ZFS_ERR_BAD_FS; + } + + if (grub_zfs_to_cpu64(le->le_hash, endian) != h) + continue; + + if (zap_leaf_array_equal(l, endian, blksft, + grub_zfs_to_cpu16(le->le_name_chunk, endian), + grub_zfs_to_cpu16(le->le_name_length, endian), + name)) { + struct zap_leaf_array *la; + + if (le->le_int_size != 8 || le->le_value_length != 1) { + printf("invalid leaf chunk entry\n"); + return ZFS_ERR_BAD_FS; + } + /* get the uint64_t property value */ + la = &ZAP_LEAF_CHUNK(l, blksft, le->le_value_chunk).l_array; + + *value = grub_be_to_cpu64(la->la_array64); + + return ZFS_ERR_NONE; + } + } + + printf("couldn't find '%s'\n", name); + return ZFS_ERR_FILE_NOT_FOUND; +} + + +/* Verify if this is a fat zap header block */ +static int +zap_verify(zap_phys_t *zap) +{ + if (zap->zap_magic != (uint64_t) ZAP_MAGIC) { + printf("bad ZAP magic\n"); + return ZFS_ERR_BAD_FS; + } + + if (zap->zap_flags != 0) { + printf("bad ZAP flags\n"); + return ZFS_ERR_BAD_FS; + } + + if (zap->zap_salt == 0) { + printf("bad ZAP salt\n"); + return ZFS_ERR_BAD_FS; + } + + return ZFS_ERR_NONE; +} + +/* + * Fat 
ZAP lookup + * + */ +/* XXX */ +static int +fzap_lookup(dnode_end_t *zap_dnode, zap_phys_t *zap, + char *name, uint64_t *value, struct grub_zfs_data *data) +{ + void *l; + uint64_t hash, idx, blkid; + int blksft = zfs_log2(grub_zfs_to_cpu16(zap_dnode->dn.dn_datablkszsec, + zap_dnode->endian) << DNODE_SHIFT); + int err; + grub_zfs_endian_t leafendian; + + err = zap_verify(zap); + if (err) + return err; + + hash = zap_hash(zap->zap_salt, name); + + /* get block id from index */ + if (zap->zap_ptrtbl.zt_numblks != 0) { + printf("external pointer tables not supported\n"); + return ZFS_ERR_NOT_IMPLEMENTED_YET; + } + idx = ZAP_HASH_IDX(hash, zap->zap_ptrtbl.zt_shift); + blkid = ((uint64_t *) zap)[idx + (1 << (blksft - 3 - 1))]; + + /* Get the leaf block */ + if ((1U << blksft) < sizeof(zap_leaf_phys_t)) { + printf("ZAP leaf is too small\n"); + return ZFS_ERR_BAD_FS; + } + err = dmu_read(zap_dnode, blkid, &l, &leafendian, data); + if (err) + return err; + + err = zap_leaf_lookup(l, leafendian, blksft, hash, name, value); + free(l); + return err; +} + +/* XXX */ +static int +fzap_iterate(dnode_end_t *zap_dnode, zap_phys_t *zap, + int (*hook)(const char *name, + uint64_t val, + struct grub_zfs_data *data), + struct grub_zfs_data *data) +{ + zap_leaf_phys_t *l; + void *l_in; + uint64_t idx, blkid; + uint16_t chunk; + int blksft = zfs_log2(grub_zfs_to_cpu16(zap_dnode->dn.dn_datablkszsec, + zap_dnode->endian) << DNODE_SHIFT); + int err; + grub_zfs_endian_t endian; + + if (zap_verify(zap)) + return 0; + + /* get block id from index */ + if (zap->zap_ptrtbl.zt_numblks != 0) { + printf("external pointer tables not supported\n"); + return 0; + } + /* Get the leaf block */ + if ((1U << blksft) < sizeof(zap_leaf_phys_t)) { + printf("ZAP leaf is too small\n"); + return 0; + } + for (idx = 0; idx < zap->zap_ptrtbl.zt_numblks; idx++) { + blkid = ((uint64_t *) zap)[idx + (1 << (blksft - 3 - 1))]; + + err = dmu_read(zap_dnode, blkid, &l_in, &endian, data); + l = l_in; + if (err) + 
continue; + + /* Verify if this is a valid leaf block */ + if (grub_zfs_to_cpu64(l->l_hdr.lh_block_type, endian) != ZBT_LEAF) { + free(l); + continue; + } + if (grub_zfs_to_cpu32(l->l_hdr.lh_magic, endian) != ZAP_LEAF_MAGIC) { + free(l); + continue; + } + + for (chunk = 0; chunk < ZAP_LEAF_NUMCHUNKS(blksft); chunk++) { + char *buf; + struct zap_leaf_array *la; + struct zap_leaf_entry *le; + uint64_t val; + le = ZAP_LEAF_ENTRY(l, blksft, chunk); + + /* Verify the chunk entry */ + if (le->le_type != ZAP_CHUNK_ENTRY) + continue; + + buf = malloc(grub_zfs_to_cpu16(le->le_name_length, endian) + + 1); + if (zap_leaf_array_get(l, endian, blksft, le->le_name_chunk, + le->le_name_length, buf)) { + free(buf); + continue; + } + buf[le->le_name_length] = 0; + + if (le->le_int_size != 8 + || grub_zfs_to_cpu16(le->le_value_length, endian) != 1) + continue; + + /* get the uint64_t property value */ + la = &ZAP_LEAF_CHUNK(l, blksft, le->le_value_chunk).l_array; + val = grub_be_to_cpu64(la->la_array64); + if (hook(buf, val, data)) + return 1; + free(buf); + } + } + return 0; +} + + +/* + * Read in the data of a zap object and find the value for a matching + * property name. + * + */ +static int +zap_lookup(dnode_end_t *zap_dnode, char *name, uint64_t *val, + struct grub_zfs_data *data) +{ + uint64_t block_type; + int size; + void *zapbuf; + int err; + grub_zfs_endian_t endian; + + /* Read in the first block of the zap object data. 
*/ + size = grub_zfs_to_cpu16(zap_dnode->dn.dn_datablkszsec, + zap_dnode->endian) << SPA_MINBLOCKSHIFT; + err = dmu_read(zap_dnode, 0, &zapbuf, &endian, data); + if (err) + return err; + block_type = grub_zfs_to_cpu64(*((uint64_t *) zapbuf), endian); + + if (block_type == ZBT_MICRO) { + err = (mzap_lookup(zapbuf, endian, size, name, val)); + free(zapbuf); + return err; + } else if (block_type == ZBT_HEADER) { + /* this is a fat zap */ + err = (fzap_lookup(zap_dnode, zapbuf, name, val, data)); + free(zapbuf); + return err; + } + + printf("unknown ZAP type\n"); + return ZFS_ERR_BAD_FS; +} + +static int +zap_iterate(dnode_end_t *zap_dnode, + int (*hook)(const char *name, uint64_t val, + struct grub_zfs_data *data), + struct grub_zfs_data *data) +{ + uint64_t block_type; + int size; + void *zapbuf; + int err; + int ret; + grub_zfs_endian_t endian; + + /* Read in the first block of the zap object data. */ + size = grub_zfs_to_cpu16(zap_dnode->dn.dn_datablkszsec, zap_dnode->endian) << SPA_MINBLOCKSHIFT; + err = dmu_read(zap_dnode, 0, &zapbuf, &endian, data); + if (err) + return 0; + block_type = grub_zfs_to_cpu64(*((uint64_t *) zapbuf), endian); + + if (block_type == ZBT_MICRO) { + ret = mzap_iterate(zapbuf, endian, size, hook, data); + free(zapbuf); + return ret; + } else if (block_type == ZBT_HEADER) { + /* this is a fat zap */ + ret = fzap_iterate(zap_dnode, zapbuf, hook, data); + free(zapbuf); + return ret; + } + printf("unknown ZAP type\n"); + return 0; +} + + +/* + * Get the dnode of an object number from the metadnode of an object set. 
+ * + * Input + * mdn - metadnode to get the object dnode + * objnum - object number for the object dnode + * buf - data buffer that holds the returning dnode + */ +static int +dnode_get(dnode_end_t *mdn, uint64_t objnum, uint8_t type, + dnode_end_t *buf, struct grub_zfs_data *data) +{ + uint64_t blkid, blksz; /* the block id this object dnode is in */ + int epbs; /* shift of number of dnodes in a block */ + int idx; /* index within a block */ + void *dnbuf; + int err; + grub_zfs_endian_t endian; + + blksz = grub_zfs_to_cpu16(mdn->dn.dn_datablkszsec, + mdn->endian) << SPA_MINBLOCKSHIFT; + + epbs = zfs_log2(blksz) - DNODE_SHIFT; + blkid = objnum >> epbs; + idx = objnum & ((1 << epbs) - 1); + + if (data->dnode_buf != NULL && memcmp(data->dnode_mdn, mdn, + sizeof(*mdn)) == 0 + && objnum >= data->dnode_start && objnum < data->dnode_end) { + memmove(&(buf->dn), &(data->dnode_buf)[idx], DNODE_SIZE); + buf->endian = data->dnode_endian; + if (type && buf->dn.dn_type != type) { + printf("incorrect dnode type: %02X != %02x\n", buf->dn.dn_type, type); + return ZFS_ERR_BAD_FS; + } + return ZFS_ERR_NONE; + } + + err = dmu_read(mdn, blkid, &dnbuf, &endian, data); + if (err) + return err; + + free(data->dnode_buf); + free(data->dnode_mdn); + data->dnode_mdn = malloc(sizeof(*mdn)); + if (!data->dnode_mdn) { + data->dnode_buf = 0; + } else { + memcpy(data->dnode_mdn, mdn, sizeof(*mdn)); + data->dnode_buf = dnbuf; + data->dnode_start = blkid << epbs; + data->dnode_end = (blkid + 1) << epbs; + data->dnode_endian = endian; + } + + memmove(&(buf->dn), (dnode_phys_t *) dnbuf + idx, DNODE_SIZE); + buf->endian = endian; + if (type && buf->dn.dn_type != type) { + printf("incorrect dnode type\n"); + return ZFS_ERR_BAD_FS; + } + + return ZFS_ERR_NONE; +} + +/* + * Get the file dnode for a given file name where mdn is the meta dnode + * for this ZFS object set. When found, place the file dnode in dn. + * The 'path' argument will be mangled. 
+ * + */ +static int +dnode_get_path(dnode_end_t *mdn, const char *path_in, dnode_end_t *dn, + struct grub_zfs_data *data) +{ + uint64_t objnum, version; + char *cname, ch; + int err = ZFS_ERR_NONE; + char *path, *path_buf; + struct dnode_chain { + struct dnode_chain *next; + dnode_end_t dn; + }; + struct dnode_chain *dnode_path = 0, *dn_new, *root; + + dn_new = malloc(sizeof(*dn_new)); + if (!dn_new) + return ZFS_ERR_OUT_OF_MEMORY; + dn_new->next = 0; + dnode_path = root = dn_new; + + err = dnode_get(mdn, MASTER_NODE_OBJ, DMU_OT_MASTER_NODE, + &(dnode_path->dn), data); + if (err) { + free(dn_new); + return err; + } + + err = zap_lookup(&(dnode_path->dn), ZPL_VERSION_STR, &version, data); + if (err) { + free(dn_new); + return err; + } + if (version > ZPL_VERSION) { + free(dn_new); + printf("too new ZPL version\n"); + return ZFS_ERR_NOT_IMPLEMENTED_YET; + } + + err = zap_lookup(&(dnode_path->dn), ZFS_ROOT_OBJ, &objnum, data); + if (err) { + free(dn_new); + return err; + } + + err = dnode_get(mdn, objnum, 0, &(dnode_path->dn), data); + if (err) { + free(dn_new); + return err; + } + + path = path_buf = strdup(path_in); + if (!path_buf) { + free(dn_new); + return ZFS_ERR_OUT_OF_MEMORY; + } + + while (1) { + /* skip leading slashes */ + while (*path == '/') + path++; + if (!*path) + break; + /* get the next component name */ + cname = path; + while (*path && *path != '/') + path++; + /* Skip dot. */ + if (cname + 1 == path && cname[0] == '.') + continue; + /* Handle double dot. */ + if (cname + 2 == path && cname[0] == '.' 
&& cname[1] == '.') { + if (dnode_path->next) { + dn_new = dnode_path; + dnode_path = dn_new->next; + free(dn_new); + } else { + printf("can't resolve ..\n"); + err = ZFS_ERR_FILE_NOT_FOUND; + break; + } + continue; + } + + ch = *path; + *path = 0; /* ensure null termination */ + + if (dnode_path->dn.dn.dn_type != DMU_OT_DIRECTORY_CONTENTS) { + printf("not a directory\n"); + err = ZFS_ERR_BAD_FILE_TYPE; + break; + } + err = zap_lookup(&(dnode_path->dn), cname, &objnum, data); + if (err) + break; + + dn_new = malloc(sizeof(*dn_new)); + if (!dn_new) { + err = ZFS_ERR_OUT_OF_MEMORY; + break; + } + dn_new->next = dnode_path; + dnode_path = dn_new; + + objnum = ZFS_DIRENT_OBJ(objnum); + err = dnode_get(mdn, objnum, 0, &(dnode_path->dn), data); + if (err) + break; + + *path = ch; + } + + if (!err) + memcpy(dn, &(dnode_path->dn), sizeof(*dn)); + + while (dnode_path) { + dn_new = dnode_path->next; + free(dnode_path); + dnode_path = dn_new; + } + free(path_buf); + return err; +} + + +/* + * Given a MOS metadnode, get the metadnode of a given filesystem name (fsname), + * e.g. pool/rootfs, or a given object number (obj), e.g. the object number + * of pool/rootfs. + * + * If no fsname and no obj are given, return the DSL_DIR metadnode. + * If fsname is given, return its metadnode and its matching object number. + * If only obj is given, return the metadnode for this object number. 
+ * + */ +static int +get_filesystem_dnode(dnode_end_t *mosmdn, char *fsname, + dnode_end_t *mdn, struct grub_zfs_data *data) +{ + uint64_t objnum; + int err; + + err = dnode_get(mosmdn, DMU_POOL_DIRECTORY_OBJECT, + DMU_OT_OBJECT_DIRECTORY, mdn, data); + if (err) + return err; + + err = zap_lookup(mdn, DMU_POOL_ROOT_DATASET, &objnum, data); + if (err) + return err; + + err = dnode_get(mosmdn, objnum, DMU_OT_DSL_DIR, mdn, data); + if (err) + return err; + + while (*fsname) { + uint64_t childobj; + char *cname, ch; + + while (*fsname == '/') + fsname++; + + if (!*fsname || *fsname == '@') + break; + + cname = fsname; + while (*fsname && !isspace(*fsname) && *fsname != '/') + fsname++; + ch = *fsname; + *fsname = 0; + + childobj = grub_zfs_to_cpu64((((dsl_dir_phys_t *) DN_BONUS(&mdn->dn)))->dd_child_dir_zapobj, mdn->endian); + err = dnode_get(mosmdn, childobj, + DMU_OT_DSL_DIR_CHILD_MAP, mdn, data); + if (err) + return err; + + err = zap_lookup(mdn, cname, &objnum, data); + if (err) + return err; + + err = dnode_get(mosmdn, objnum, DMU_OT_DSL_DIR, mdn, data); + if (err) + return err; + + *fsname = ch; + } + return ZFS_ERR_NONE; +} + +static int +make_mdn(dnode_end_t *mdn, struct grub_zfs_data *data) +{ + void *osp; + blkptr_t *bp; + size_t ospsize; + int err; + + bp = &(((dsl_dataset_phys_t *) DN_BONUS(&mdn->dn))->ds_bp); + err = zio_read(bp, mdn->endian, &osp, &ospsize, data); + if (err) + return err; + if (ospsize < OBJSET_PHYS_SIZE_V14) { + free(osp); + printf("too small osp\n"); + return ZFS_ERR_BAD_FS; + } + + mdn->endian = (grub_zfs_to_cpu64(bp->blk_prop, mdn->endian)>>63) & 1; + memmove((char *) &(mdn->dn), + (char *) &((objset_phys_t *) osp)->os_meta_dnode, DNODE_SIZE); + free(osp); + return ZFS_ERR_NONE; +} + +static int +dnode_get_fullpath(const char *fullpath, dnode_end_t *mdn, + uint64_t *mdnobj, dnode_end_t *dn, int *isfs, + struct grub_zfs_data *data) +{ + char *fsname, *snapname; + const char *ptr_at, *filename; + uint64_t headobj; + int err; + + ptr_at 
= strchr(fullpath, '@'); + if (!ptr_at) { + *isfs = 1; + filename = 0; + snapname = 0; + fsname = strdup(fullpath); + } else { + const char *ptr_slash = strchr(ptr_at, '/'); + + *isfs = 0; + fsname = malloc(ptr_at - fullpath + 1); + if (!fsname) + return ZFS_ERR_OUT_OF_MEMORY; + memcpy(fsname, fullpath, ptr_at - fullpath); + fsname[ptr_at - fullpath] = 0; + if (ptr_at[1] && ptr_at[1] != '/') { + snapname = malloc(ptr_slash - ptr_at); + if (!snapname) { + free(fsname); + return ZFS_ERR_OUT_OF_MEMORY; + } + memcpy(snapname, ptr_at + 1, ptr_slash - ptr_at - 1); + snapname[ptr_slash - ptr_at - 1] = 0; + } else { + snapname = 0; + } + if (ptr_slash) + filename = ptr_slash; + else + filename = "/"; + printf("zfs fsname = '%s' snapname='%s' filename = '%s'\n", + fsname, snapname, filename); + } + + + err = get_filesystem_dnode(&(data->mos), fsname, dn, data); + + if (err) { + free(fsname); + free(snapname); + return err; + } + + headobj = grub_zfs_to_cpu64(((dsl_dir_phys_t *) DN_BONUS(&dn->dn))->dd_head_dataset_obj, dn->endian); + + err = dnode_get(&(data->mos), headobj, DMU_OT_DSL_DATASET, mdn, data); + if (err) { + free(fsname); + free(snapname); + return err; + } + + if (snapname) { + uint64_t snapobj; + + snapobj = grub_zfs_to_cpu64(((dsl_dataset_phys_t *) DN_BONUS(&mdn->dn))->ds_snapnames_zapobj, mdn->endian); + + err = dnode_get(&(data->mos), snapobj, + DMU_OT_DSL_DS_SNAP_MAP, mdn, data); + if (!err) + err = zap_lookup(mdn, snapname, &headobj, data); + if (!err) + err = dnode_get(&(data->mos), headobj, DMU_OT_DSL_DATASET, mdn, data); + if (err) { + free(fsname); + free(snapname); + return err; + } + } + + if (mdnobj) + *mdnobj = headobj; + + make_mdn(mdn, data); + + if (*isfs) { + free(fsname); + free(snapname); + return ZFS_ERR_NONE; + } + err = dnode_get_path(mdn, filename, dn, data); + free(fsname); + free(snapname); + return err; +} + +/* + * For a given XDR packed nvlist, verify the first 4 bytes and move on. 
+ * + * An XDR packed nvlist is encoded as (comments from nvs_xdr_create) : + * + * encoding method/host endian (4 bytes) + * nvl_version (4 bytes) + * nvl_nvflag (4 bytes) + * encoded nvpairs: + * encoded size of the nvpair (4 bytes) + * decoded size of the nvpair (4 bytes) + * name string size (4 bytes) + * name string data (sizeof(NV_ALIGN4(string)) + * data type (4 bytes) + * # of elements in the nvpair (4 bytes) + * data + * 2 zero's for the last nvpair + * (end of the entire list) (8 bytes) + * + */ + +static int +nvlist_find_value(char *nvlist, char *name, int valtype, char **val, + size_t *size_out, size_t *nelm_out) +{ + int name_len, type, encode_size; + char *nvpair, *nvp_name; + + /* Verify if the 1st and 2nd byte in the nvlist are valid. */ + /* NOTE: independently of what endianness header announces all + subsequent values are big-endian. */ + if (nvlist[0] != NV_ENCODE_XDR || (nvlist[1] != NV_LITTLE_ENDIAN + && nvlist[1] != NV_BIG_ENDIAN)) { + printf("zfs incorrect nvlist header\n"); + return ZFS_ERR_BAD_FS; + } + + /* skip the header, nvl_version, and nvl_nvflag */ + nvlist = nvlist + 4 * 3; + /* + * Loop thru the nvpair list + * The XDR representation of an integer is in big-endian byte order. 
+ */ + while ((encode_size = grub_be_to_cpu32(*(uint32_t *) nvlist))) { + int nelm; + + nvpair = nvlist + 4 * 2; /* skip the encode/decode size */ + + name_len = grub_be_to_cpu32(*(uint32_t *) nvpair); + nvpair += 4; + + nvp_name = nvpair; + nvpair = nvpair + ((name_len + 3) & ~3); /* align */ + + type = grub_be_to_cpu32(*(uint32_t *) nvpair); + nvpair += 4; + + nelm = grub_be_to_cpu32(*(uint32_t *) nvpair); + if (nelm < 1) { + printf("empty nvpair\n"); + return ZFS_ERR_BAD_FS; + } + + nvpair += 4; + + if ((strncmp(nvp_name, name, name_len) == 0) && type == valtype) { + *val = nvpair; + *size_out = encode_size; + if (nelm_out) + *nelm_out = nelm; + return 1; + } + + nvlist += encode_size; /* goto the next nvpair */ + } + return 0; +} + +int +grub_zfs_nvlist_lookup_uint64(char *nvlist, char *name, uint64_t *out) +{ + char *nvpair; + size_t size; + int found; + + found = nvlist_find_value(nvlist, name, DATA_TYPE_UINT64, &nvpair, &size, 0); + if (!found) + return 0; + if (size < sizeof(uint64_t)) { + printf("invalid uint64\n"); + return ZFS_ERR_BAD_FS; + } + + *out = grub_be_to_cpu64(*(uint64_t *) nvpair); + return 1; +} + +char * +grub_zfs_nvlist_lookup_string(char *nvlist, char *name) +{ + char *nvpair; + char *ret; + size_t slen; + size_t size; + int found; + + found = nvlist_find_value(nvlist, name, DATA_TYPE_STRING, &nvpair, &size, 0); + if (!found) + return 0; + if (size < 4) { + printf("invalid string\n"); + return 0; + } + slen = grub_be_to_cpu32(*(uint32_t *) nvpair); + if (slen > size - 4) + slen = size - 4; + ret = malloc(slen + 1); + if (!ret) + return 0; + memcpy(ret, nvpair + 4, slen); + ret[slen] = 0; + return ret; +} + +char * +grub_zfs_nvlist_lookup_nvlist(char *nvlist, char *name) +{ + char *nvpair; + char *ret; + size_t size; + int found; + + found = nvlist_find_value(nvlist, name, DATA_TYPE_NVLIST, &nvpair, + &size, 0); + if (!found) + return 0; + ret = calloc(1, size + 3 * sizeof(uint32_t)); + if (!ret) + return 0; + memcpy(ret, nvlist, 
sizeof(uint32_t)); + + memcpy(ret + sizeof(uint32_t), nvpair, size); + return ret; +} + +int +grub_zfs_nvlist_lookup_nvlist_array_get_nelm(char *nvlist, char *name) +{ + char *nvpair; + size_t nelm, size; + int found; + + found = nvlist_find_value(nvlist, name, DATA_TYPE_NVLIST, &nvpair, + &size, &nelm); + if (!found) + return -1; + return nelm; +} + +char * +grub_zfs_nvlist_lookup_nvlist_array(char *nvlist, char *name, + size_t index) +{ + char *nvpair, *nvpairptr; + int found; + char *ret; + size_t size; + unsigned i; + size_t nelm; + + found = nvlist_find_value(nvlist, name, DATA_TYPE_NVLIST, &nvpair, + &size, &nelm); + if (!found) + return 0; + if (index >= nelm) { + printf("trying to lookup past nvlist array\n"); + return 0; + } + + nvpairptr = nvpair; + + for (i = 0; i < index; i++) { + uint32_t encode_size; + + /* skip nvl_version and nvl_nvflag of the embedded nvlist */ + nvpairptr = nvpairptr + 4 * 2; + + while (nvpairptr < nvpair + size + && (encode_size = grub_be_to_cpu32(*(uint32_t *) nvpairptr))) + nvpairptr += encode_size; /* go to the next nvpair */ + + nvpairptr = nvpairptr + 4 * 2; /* skip the ending 2 zeros - 8 bytes */ + } + + if (nvpairptr >= nvpair + size + || nvpairptr + grub_be_to_cpu32(*(uint32_t *) (nvpairptr + 4 * 2)) + >= nvpair + size) { + printf("incorrect nvlist array\n"); + return 0; + } + + ret = calloc(1, grub_be_to_cpu32(*(uint32_t *) (nvpairptr + 4 * 2)) + + 3 * sizeof(uint32_t)); + if (!ret) + return 0; + memcpy(ret, nvlist, sizeof(uint32_t)); + + memcpy(ret + sizeof(uint32_t), nvpairptr, size); + return ret; +} + +static int +zfs_fetch_nvlist(struct grub_zfs_data *data, char **nvlist) +{ + int err; + + *nvlist = malloc(VDEV_PHYS_SIZE); + /* Read in the vdev name-value pair list (112K). */ + err = zfs_devread(data->vdev_phys_sector, 0, VDEV_PHYS_SIZE, *nvlist); + if (err) { + free(*nvlist); + *nvlist = 0; + return err; + } + return ZFS_ERR_NONE; +} + +/* + * Check the disk label information and retrieve needed vdev name-value pairs. 
+ * + */ +static int +check_pool_label(struct grub_zfs_data *data) +{ + uint64_t pool_state; + char *nvlist; /* for the pool */ + char *vdevnvlist; /* for the vdev */ + uint64_t diskguid; + uint64_t version; + int found; + int err; + + err = zfs_fetch_nvlist(data, &nvlist); + if (err) + return err; + + found = grub_zfs_nvlist_lookup_uint64(nvlist, ZPOOL_CONFIG_POOL_STATE, + &pool_state); + if (!found) { + free(nvlist); + printf("zfs pool state not found\n"); + return ZFS_ERR_BAD_FS; + } + + if (pool_state == POOL_STATE_DESTROYED) { + free(nvlist); + printf("zpool is marked as destroyed\n"); + return ZFS_ERR_BAD_FS; + } + + data->label_txg = 0; + found = grub_zfs_nvlist_lookup_uint64(nvlist, ZPOOL_CONFIG_POOL_TXG, + &data->label_txg); + if (!found) { + free(nvlist); + printf("zfs pool txg not found\n"); + return ZFS_ERR_BAD_FS; + } + + /* not an active device */ + if (data->label_txg == 0) { + free(nvlist); + printf("zpool is not active\n"); + return ZFS_ERR_BAD_FS; + } + + found = grub_zfs_nvlist_lookup_uint64(nvlist, ZPOOL_CONFIG_VERSION, + &version); + if (!found) { + free(nvlist); + printf("zpool config version not found\n"); + return ZFS_ERR_BAD_FS; + } + + if (version > SPA_VERSION) { + free(nvlist); + printf("SPA version too new %llu > %llu\n", + (unsigned long long) version, + (unsigned long long) SPA_VERSION); + return ZFS_ERR_NOT_IMPLEMENTED_YET; + } + + vdevnvlist = grub_zfs_nvlist_lookup_nvlist(nvlist, ZPOOL_CONFIG_VDEV_TREE); + if (!vdevnvlist) { + free(nvlist); + printf("ZFS config vdev tree not found\n"); + return ZFS_ERR_BAD_FS; + } + + found = grub_zfs_nvlist_lookup_uint64(vdevnvlist, ZPOOL_CONFIG_ASHIFT, + &data->vdev_ashift); + free(vdevnvlist); + if (!found) { + free(nvlist); + printf("ZPOOL config ashift not found\n"); + return ZFS_ERR_BAD_FS; + } + + found = grub_zfs_nvlist_lookup_uint64(nvlist, ZPOOL_CONFIG_GUID, &diskguid); + if (!found) { + free(nvlist); + printf("ZPOOL config guid not found\n"); + return ZFS_ERR_BAD_FS; + } + + found = 
grub_zfs_nvlist_lookup_uint64(nvlist, ZPOOL_CONFIG_POOL_GUID, &data->pool_guid); + if (!found) { + free(nvlist); + printf("ZPOOL config pool guid not found\n"); + return ZFS_ERR_BAD_FS; + } + + free(nvlist); + + printf("ZFS Pool GUID: %llu (%016llx) Label: GUID: %llu (%016llx), txg: %llu, SPA v%llu, ashift: %llu\n", + (unsigned long long) data->pool_guid, + (unsigned long long) data->pool_guid, + (unsigned long long) diskguid, + (unsigned long long) diskguid, + (unsigned long long) data->label_txg, + (unsigned long long) version, + (unsigned long long) data->vdev_ashift); + + return ZFS_ERR_NONE; +} + +/* + * vdev_label_start returns the physical disk offset (in bytes) of + * label "l". + */ +static uint64_t vdev_label_start(uint64_t psize, int l) +{ + return (l * sizeof(vdev_label_t) + (l < VDEV_LABELS / 2 ? + 0 : psize - + VDEV_LABELS * sizeof(vdev_label_t))); +} + +void +zfs_unmount(struct grub_zfs_data *data) +{ + free(data->dnode_buf); + free(data->dnode_mdn); + free(data->file_buf); + free(data); +} + +/* + * zfs_mount() locates a valid uberblock of the root pool and reads its MOS + * into the memory address MOS. + * + */ +struct grub_zfs_data * +zfs_mount(device_t dev) +{ + struct grub_zfs_data *data = 0; + int label = 0, bestlabel = -1; + char *ub_array; + uberblock_t *ubbest; + uberblock_t *ubcur = NULL; + void *osp = 0; + size_t ospsize; + int err; + + data = malloc(sizeof(*data)); + if (!data) + return 0; + memset(data, 0, sizeof(*data)); + + ub_array = malloc(VDEV_UBERBLOCK_RING); + if (!ub_array) { + zfs_unmount(data); + return 0; + } + + ubbest = malloc(sizeof(*ubbest)); + if (!ubbest) { + zfs_unmount(data); + return 0; + } + memset(ubbest, 0, sizeof(*ubbest)); + + /* + * Some El Torito stacks don't give us a size and + * we end up setting the size to MAXUINT. Furthermore, + * some of these devices stop working once a single + * read past the end has been issued. 
Checking + * for a maximum part_length and skipping the backup + * labels at the end of the slice/partition/device + * avoids breaking down on such devices. + */ + const int vdevnum = + dev->part_length == 0 ? + VDEV_LABELS / 2 : VDEV_LABELS; + + /* Size in bytes of the device (disk or partition) aligned to label size*/ + uint64_t device_size = + dev->part_length << SECTOR_BITS; + + const uint64_t alignedbytes = + P2ALIGN(device_size, (uint64_t) sizeof(vdev_label_t)); + + for (label = 0; label < vdevnum; label++) { + uint64_t labelstartbytes = vdev_label_start(alignedbytes, label); + uint64_t labelstart = labelstartbytes >> SECTOR_BITS; + + debug("zfs reading label %d at sector %llu (byte %llu)\n", + label, (unsigned long long) labelstart, + (unsigned long long) labelstartbytes); + + data->vdev_phys_sector = labelstart + + ((VDEV_SKIP_SIZE + VDEV_BOOT_HEADER_SIZE) >> SECTOR_BITS); + + err = check_pool_label(data); + if (err) { + printf("zfs error checking label %d\n", label); + continue; + } + + /* Read in the uberblock ring (128K). */ + err = zfs_devread(data->vdev_phys_sector + + (VDEV_PHYS_SIZE >> SECTOR_BITS), + 0, VDEV_UBERBLOCK_RING, ub_array); + if (err) { + printf("zfs error reading uberblock ring for label %d\n", label); + continue; + } + + ubcur = find_bestub(ub_array, data); + if (!ubcur) { + printf("zfs No good uberblocks found in label %d\n", label); + continue; + } + + if (vdev_uberblock_compare(ubcur, ubbest) > 0) { + /* Looks like the block is good, so use it.*/ + memcpy(ubbest, ubcur, sizeof(*ubbest)); + bestlabel = label; + debug("zfs Current best uberblock found in label %d\n", label); + } + } + free(ub_array); + + /* We zero'd the structure to begin with. If we never assigned to it, + magic will still be zero. 
*/ + if (!ubbest->ub_magic) { + printf("couldn't find a valid ZFS label\n"); + zfs_unmount(data); + free(ubbest); + return 0; + } + + debug("zfs ubbest %p in label %d\n", ubbest, bestlabel); + + grub_zfs_endian_t ub_endian = + grub_zfs_to_cpu64(ubbest->ub_magic, LITTLE_ENDIAN) == UBERBLOCK_MAGIC + ? LITTLE_ENDIAN : BIG_ENDIAN; + + debug("zfs endian set to %s\n", !ub_endian ? "big" : "little"); + + err = zio_read(&ubbest->ub_rootbp, ub_endian, &osp, &ospsize, data); + + if (err) { + printf("couldn't zio_read object directory\n"); + zfs_unmount(data); + free(ubbest); + return 0; + } + + if (ospsize < OBJSET_PHYS_SIZE_V14) { + printf("osp too small\n"); + zfs_unmount(data); + free(osp); + free(ubbest); + return 0; + } + + /* Got the MOS. Save it at the memory addr MOS. */ + memmove(&(data->mos.dn), &((objset_phys_t *) osp)->os_meta_dnode, DNODE_SIZE); + data->mos.endian = + (grub_zfs_to_cpu64(ubbest->ub_rootbp.blk_prop, ub_endian) >> 63) & 1; + memmove(&(data->current_uberblock), ubbest, sizeof(uberblock_t)); + + free(osp); + free(ubbest); + + return data; +} + +int +grub_zfs_fetch_nvlist(device_t dev, char **nvlist) +{ + struct grub_zfs_data *zfs; + int err; + + zfs = zfs_mount(dev); + if (!zfs) + return ZFS_ERR_BAD_FS; + err = zfs_fetch_nvlist(zfs, nvlist); + zfs_unmount(zfs); + return err; +} + +static int +zfs_label(device_t device, char **label) +{ + char *nvlist; + int err; + struct grub_zfs_data *data; + + data = zfs_mount(device); + if (!data) + return ZFS_ERR_BAD_FS; + + err = zfs_fetch_nvlist(data, &nvlist); + if (err) { + zfs_unmount(data); + return err; + } + + *label = grub_zfs_nvlist_lookup_string(nvlist, ZPOOL_CONFIG_POOL_NAME); + free(nvlist); + zfs_unmount(data); + return ZFS_ERR_NONE; +} + +static int +zfs_uuid(device_t device, char **uuid) +{ + struct grub_zfs_data *data; + + data = zfs_mount(device); + if (!data) + return ZFS_ERR_BAD_FS; + + *uuid = malloc(17); /* %016llx + nil */ + if (!*uuid) + return ZFS_ERR_OUT_OF_MEMORY; + + /* *uuid = 
xasprintf ("%016llx", (long long unsigned) data->pool_guid);*/ + snprintf(*uuid, 17, "%016llx", (long long unsigned) data->pool_guid); + zfs_unmount(data); + + return ZFS_ERR_NONE; +} + +/* + * zfs_open() locates a file in the root pool by following the + * MOS and places the dnode of the file in the memory address DNODE. + */ +int +zfs_open(struct zfs_file *file, const char *fsfilename) +{ + struct grub_zfs_data *data; + int err; + int isfs; + + data = zfs_mount(file->device); + if (!data) + return ZFS_ERR_BAD_FS; + + err = dnode_get_fullpath(fsfilename, &(data->mdn), 0, + &(data->dnode), &isfs, data); + if (err) { + zfs_unmount(data); + return err; + } + + if (isfs) { + zfs_unmount(data); + printf("Missing @ or / separator\n"); + return ZFS_ERR_FILE_NOT_FOUND; + } + + /* We found the dnode for this file. Verify that it is a plain file. */ + if (data->dnode.dn.dn_type != DMU_OT_PLAIN_FILE_CONTENTS) { + zfs_unmount(data); + printf("not a file\n"); + return ZFS_ERR_BAD_FILE_TYPE; + } + + /* get the file size and set the file position to 0 */ + + /* + * For DMU_OT_SA we will need to locate the SIZE attribute, + * which could be either in the bonus buffer + * or the "spill" block. 
+ */ + if (data->dnode.dn.dn_bonustype == DMU_OT_SA) { + void *sahdrp; + int hdrsize; + + if (data->dnode.dn.dn_bonuslen != 0) { + sahdrp = (sa_hdr_phys_t *) DN_BONUS(&data->dnode.dn); + } else if (data->dnode.dn.dn_flags & DNODE_FLAG_SPILL_BLKPTR) { + blkptr_t *bp = &data->dnode.dn.dn_spill; + + err = zio_read(bp, data->dnode.endian, &sahdrp, NULL, data); + if (err) + return err; + } else { + printf("filesystem is corrupt :(\n"); + return ZFS_ERR_BAD_FS; + } + + hdrsize = SA_HDR_SIZE(((sa_hdr_phys_t *) sahdrp)); + file->size = *(uint64_t *) ((char *) sahdrp + hdrsize + SA_SIZE_OFFSET); + } else { + file->size = grub_zfs_to_cpu64(((znode_phys_t *) DN_BONUS(&data->dnode.dn))->zp_size, data->dnode.endian); + } + + file->data = data; + file->offset = 0; + + return ZFS_ERR_NONE; +} + +uint64_t +zfs_read(zfs_file_t file, char *buf, uint64_t len) +{ + struct grub_zfs_data *data = (struct grub_zfs_data *) file->data; + int blksz, movesize; + uint64_t length; + int64_t red; + int err; + + if (data->file_buf == NULL) { + data->file_buf = malloc(SPA_MAXBLOCKSIZE); + if (!data->file_buf) + return -1; + data->file_start = data->file_end = 0; + } + + /* + * If offset is in memory, move it into the buffer provided and return. + */ + if (file->offset >= data->file_start + && file->offset + len <= data->file_end) { + memmove(buf, data->file_buf + file->offset - data->file_start, + len); + return len; + } + + blksz = grub_zfs_to_cpu16(data->dnode.dn.dn_datablkszsec, + data->dnode.endian) << SPA_MINBLOCKSHIFT; + + /* + * Entire Dnode is too big to fit into the space available. We + * will need to read it in chunks. This could be optimized to + * read in as large a chunk as there is space available, but for + * now, this only reads in one data block at a time. + */ + length = len; + red = 0; + while (length) { + void *t; + /* + * Find requested blkid and the offset within that block. 
+ */ + uint64_t blkid = (file->offset + red) / blksz; + free(data->file_buf); + data->file_buf = 0; + + err = dmu_read(&(data->dnode), blkid, &t, + 0, data); + data->file_buf = t; + if (err) + return -1; + + data->file_start = blkid * blksz; + data->file_end = data->file_start + blksz; + + movesize = MIN(length, data->file_end - (int) file->offset - red); + + memmove(buf, data->file_buf + file->offset + red + - data->file_start, movesize); + buf += movesize; + length -= movesize; + red += movesize; + } + + return len; +} + +int +zfs_close(zfs_file_t file) +{ + zfs_unmount((struct grub_zfs_data *) file->data); + return ZFS_ERR_NONE; +} + +int +grub_zfs_getmdnobj(device_t dev, const char *fsfilename, + uint64_t *mdnobj) +{ + struct grub_zfs_data *data; + int err; + int isfs; + + data = zfs_mount(dev); + if (!data) + return ZFS_ERR_BAD_FS; + + err = dnode_get_fullpath(fsfilename, &(data->mdn), mdnobj, + &(data->dnode), &isfs, data); + zfs_unmount(data); + return err; +} + +static void +fill_fs_info(struct zfs_dirhook_info *info, + dnode_end_t mdn, struct grub_zfs_data *data) +{ + int err; + dnode_end_t dn; + uint64_t objnum; + uint64_t headobj; + + memset(info, 0, sizeof(*info)); + + info->dir = 1; + + if (mdn.dn.dn_type == DMU_OT_DSL_DIR) { + headobj = grub_zfs_to_cpu64(((dsl_dir_phys_t *) DN_BONUS(&mdn.dn))->dd_head_dataset_obj, mdn.endian); + + err = dnode_get(&(data->mos), headobj, DMU_OT_DSL_DATASET, &mdn, data); + if (err) { + printf("zfs failed here 1\n"); + return; + } + } + make_mdn(&mdn, data); + err = dnode_get(&mdn, MASTER_NODE_OBJ, DMU_OT_MASTER_NODE, + &dn, data); + if (err) { + printf("zfs failed here 2\n"); + return; + } + + err = zap_lookup(&dn, ZFS_ROOT_OBJ, &objnum, data); + if (err) { + printf("zfs failed here 3\n"); + return; + } + + err = dnode_get(&mdn, objnum, 0, &dn, data); + if (err) { + printf("zfs failed here 4\n"); + return; + } + + info->mtimeset = 1; + info->mtime = grub_zfs_to_cpu64(((znode_phys_t *) DN_BONUS(&dn.dn))->zp_mtime[0], 
dn.endian); + + return; +} + +static int iterate_zap(const char *name, uint64_t val, struct grub_zfs_data *data) +{ + struct zfs_dirhook_info info; + dnode_end_t dn; + + memset(&info, 0, sizeof(info)); + + dnode_get(&(data->mdn), val, 0, &dn, data); + info.mtimeset = 1; + info.mtime = grub_zfs_to_cpu64(((znode_phys_t *) DN_BONUS(&dn.dn))->zp_mtime[0], dn.endian); + info.dir = (dn.dn.dn_type == DMU_OT_DIRECTORY_CONTENTS); + debug("zfs type=%d, name=%s\n", + (int)dn.dn.dn_type, (char *)name); + if (!data->userhook) + return 0; + return data->userhook(name, &info); +} + +static int iterate_zap_fs(const char *name, uint64_t val, struct grub_zfs_data *data) +{ + struct zfs_dirhook_info info; + dnode_end_t mdn; + int err; + err = dnode_get(&(data->mos), val, 0, &mdn, data); + if (err) + return 0; + if (mdn.dn.dn_type != DMU_OT_DSL_DIR) + return 0; + + fill_fs_info(&info, mdn, data); + + if (!data->userhook) + return 0; + return data->userhook(name, &info); +} + +static int iterate_zap_snap(const char *name, uint64_t val, struct grub_zfs_data *data) +{ + struct zfs_dirhook_info info; + char *name2; + int ret = 0; + dnode_end_t mdn; + int err; + + err = dnode_get(&(data->mos), val, 0, &mdn, data); + if (err) + return 0; + + if (mdn.dn.dn_type != DMU_OT_DSL_DATASET) + return 0; + + fill_fs_info(&info, mdn, data); + + name2 = malloc(strlen(name) + 2); + name2[0] = '@'; + memcpy(name2 + 1, name, strlen(name) + 1); + if (data->userhook) + ret = data->userhook(name2, &info); + free(name2); + return ret; +} + +int +zfs_ls(device_t device, const char *path, + int (*hook)(const char *, const struct zfs_dirhook_info *)) +{ + struct grub_zfs_data *data; + int err; + int isfs; +#if 0 + char *label = NULL; + + zfs_label(device, &label); + if (label) + printf("ZPOOL label '%s'\n", + label); +#endif + + data = zfs_mount(device); + if (!data) + return ZFS_ERR_BAD_FS; + + data->userhook = hook; + + err = dnode_get_fullpath(path, &(data->mdn), 0, &(data->dnode), &isfs, data); + if (err) { 
+ zfs_unmount(data); + return err; + } + if (isfs) { + uint64_t childobj, headobj; + uint64_t snapobj; + dnode_end_t dn; + struct zfs_dirhook_info info; + + fill_fs_info(&info, data->dnode, data); + hook("@", &info); + + childobj = grub_zfs_to_cpu64(((dsl_dir_phys_t *) DN_BONUS(&data->dnode.dn))->dd_child_dir_zapobj, data->dnode.endian); + headobj = grub_zfs_to_cpu64(((dsl_dir_phys_t *) DN_BONUS(&data->dnode.dn))->dd_head_dataset_obj, data->dnode.endian); + err = dnode_get(&(data->mos), childobj, + DMU_OT_DSL_DIR_CHILD_MAP, &dn, data); + if (err) { + zfs_unmount(data); + return err; + } + + + zap_iterate(&dn, iterate_zap_fs, data); + + err = dnode_get(&(data->mos), headobj, DMU_OT_DSL_DATASET, &dn, data); + if (err) { + zfs_unmount(data); + return err; + } + + snapobj = grub_zfs_to_cpu64(((dsl_dataset_phys_t *) DN_BONUS(&dn.dn))->ds_snapnames_zapobj, dn.endian); + + err = dnode_get(&(data->mos), snapobj, + DMU_OT_DSL_DS_SNAP_MAP, &dn, data); + if (err) { + zfs_unmount(data); + return err; + } + + zap_iterate(&dn, iterate_zap_snap, data); + } else { + if (data->dnode.dn.dn_type != DMU_OT_DIRECTORY_CONTENTS) { + zfs_unmount(data); + printf("not a directory\n"); + return ZFS_ERR_BAD_FILE_TYPE; + } + zap_iterate(&(data->dnode), iterate_zap, data); + } + zfs_unmount(data); + return ZFS_ERR_NONE; +} + diff --git a/fs/zfs/zfs_fletcher.c b/fs/zfs/zfs_fletcher.c new file mode 100644 index 0000000..d96c6ff --- /dev/null +++ b/fs/zfs/zfs_fletcher.c @@ -0,0 +1,84 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004,2009 + * Free Software Foundation, Inc. + * Copyright 2007 Sun Microsystems, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. 
+ * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ + +#include <common.h> +#include <malloc.h> +#include <linux/stat.h> +#include <linux/time.h> +#include <linux/ctype.h> +#include <asm/byteorder.h> +#include "zfs_common.h" + +#include <zfs/zfs.h> +#include <zfs/zio.h> +#include <zfs/dnode.h> +#include <zfs/uberblock_impl.h> +#include <zfs/vdev_impl.h> +#include <zfs/zio_checksum.h> +#include <zfs/zap_impl.h> +#include <zfs/zap_leaf.h> +#include <zfs/zfs_znode.h> +#include <zfs/dmu.h> +#include <zfs/dmu_objset.h> +#include <zfs/dsl_dir.h> +#include <zfs/dsl_dataset.h> + +void +fletcher_2(const void *buf, uint64_t size, grub_zfs_endian_t endian, + zio_cksum_t *zcp) +{ + const uint64_t *ip = buf; + const uint64_t *ipend = ip + (size / sizeof(uint64_t)); + uint64_t a0, b0, a1, b1; + + for (a0 = b0 = a1 = b1 = 0; ip < ipend; ip += 2) { + a0 += grub_zfs_to_cpu64(ip[0], endian); + a1 += grub_zfs_to_cpu64(ip[1], endian); + b0 += a0; + b1 += a1; + } + + zcp->zc_word[0] = grub_cpu_to_zfs64(a0, endian); + zcp->zc_word[1] = grub_cpu_to_zfs64(a1, endian); + zcp->zc_word[2] = grub_cpu_to_zfs64(b0, endian); + zcp->zc_word[3] = grub_cpu_to_zfs64(b1, endian); +} + +void +fletcher_4(const void *buf, uint64_t size, grub_zfs_endian_t endian, + zio_cksum_t *zcp) +{ + const uint32_t *ip = buf; + const uint32_t *ipend = ip + (size / sizeof(uint32_t)); + uint64_t a, b, c, d; + + for (a = b = c = d = 0; ip < ipend; ip++) { + a += grub_zfs_to_cpu32(ip[0], endian); + b += a; + c += b; + d += c; + } + + zcp->zc_word[0] = grub_cpu_to_zfs64(a, endian); + zcp->zc_word[1] = grub_cpu_to_zfs64(b, endian); + zcp->zc_word[2] = grub_cpu_to_zfs64(c, endian); + 
zcp->zc_word[3] = grub_cpu_to_zfs64(d, endian); +} + diff --git a/fs/zfs/zfs_lzjb.c b/fs/zfs/zfs_lzjb.c new file mode 100644 index 0000000..33e9b90 --- /dev/null +++ b/fs/zfs/zfs_lzjb.c @@ -0,0 +1,94 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004,2009 + * Free Software Foundation, Inc. + * Copyright 2007 Sun Microsystems, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. 
+ */ + +#include <common.h> +#include <malloc.h> +#include <linux/stat.h> +#include <linux/time.h> +#include <linux/ctype.h> +#include <asm/byteorder.h> +#include "zfs_common.h" + +#include <zfs/zfs.h> +#include <zfs/zio.h> +#include <zfs/dnode.h> +#include <zfs/uberblock_impl.h> +#include <zfs/vdev_impl.h> +#include <zfs/zio_checksum.h> +#include <zfs/zap_impl.h> +#include <zfs/zap_leaf.h> +#include <zfs/zfs_znode.h> +#include <zfs/dmu.h> +#include <zfs/dmu_objset.h> +#include <zfs/dsl_dir.h> +#include <zfs/dsl_dataset.h> + +#define MATCH_BITS 6 +#define MATCH_MIN 3 +#define OFFSET_MASK ((1 << (16 - MATCH_BITS)) - 1) + +/* + * Decompression Entry - lzjb + */ +#ifndef NBBY +#define NBBY 8 +#endif + +int +lzjb_decompress(void *s_start, void *d_start, uint32_t s_len, + uint32_t d_len) +{ + uint8_t *src = s_start; + uint8_t *dst = d_start; + uint8_t *d_end = (uint8_t *) d_start + d_len; + uint8_t *s_end = (uint8_t *) s_start + s_len; + uint8_t *cpy, copymap = 0; + int copymask = 1 << (NBBY - 1); + + while (dst < d_end && src < s_end) { + if ((copymask <<= 1) == (1 << NBBY)) { + copymask = 1; + copymap = *src++; + } + if (src >= s_end) { + printf("lzjb decompression failed\n"); + return ZFS_ERR_BAD_FS; + } + if (copymap & copymask) { + int mlen = (src[0] >> (NBBY - MATCH_BITS)) + MATCH_MIN; + int offset = ((src[0] << NBBY) | src[1]) & OFFSET_MASK; + src += 2; + cpy = dst - offset; + if (src > s_end || cpy < (uint8_t *) d_start) { + printf("lzjb decompression failed\n"); + return ZFS_ERR_BAD_FS; + } + while (--mlen >= 0 && dst < d_end) + *dst++ = *cpy++; + } else { + *dst++ = *src++; + } + } + if (dst < d_end) { + printf("lzjb decompression failed\n"); + return ZFS_ERR_BAD_FS; + } + return ZFS_ERR_NONE; +} diff --git a/fs/zfs/zfs_sha256.c b/fs/zfs/zfs_sha256.c new file mode 100644 index 0000000..7a9439a --- /dev/null +++ b/fs/zfs/zfs_sha256.c @@ -0,0 +1,145 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004,2009 + * Free Software 
Foundation, Inc. + * Copyright 2007 Sun Microsystems, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ + +#include <common.h> +#include <malloc.h> +#include <linux/stat.h> +#include <linux/time.h> +#include <linux/ctype.h> +#include <asm/byteorder.h> +#include "zfs_common.h" + +#include <zfs/zfs.h> +#include <zfs/zio.h> +#include <zfs/dnode.h> +#include <zfs/uberblock_impl.h> +#include <zfs/vdev_impl.h> +#include <zfs/zio_checksum.h> +#include <zfs/zap_impl.h> +#include <zfs/zap_leaf.h> +#include <zfs/zfs_znode.h> +#include <zfs/dmu.h> +#include <zfs/dmu_objset.h> +#include <zfs/dsl_dir.h> +#include <zfs/dsl_dataset.h> + +/* + * SHA-256 checksum, as specified in FIPS 180-2, available at: + * http://csrc.nist.gov/cryptval + * + * This is a very compact implementation of SHA-256. + * It is designed to be simple and portable, not to be fast. + */ + +/* + * The literal definitions according to FIPS180-2 would be: + * + * Ch(x, y, z) (((x) & (y)) ^ ((~(x)) & (z))) + * Maj(x, y, z) (((x) & (y)) | ((x) & (z)) | ((y) & (z))) + * + * We use logical equivalents which require one less op. 
+ */ +#define Ch(x, y, z) ((z) ^ ((x) & ((y) ^ (z)))) +#define Maj(x, y, z) (((x) & (y)) ^ ((z) & ((x) ^ (y)))) +#define Rot32(x, s) (((x) >> s) | ((x) << (32 - s))) +#define SIGMA0(x) (Rot32(x, 2) ^ Rot32(x, 13) ^ Rot32(x, 22)) +#define SIGMA1(x) (Rot32(x, 6) ^ Rot32(x, 11) ^ Rot32(x, 25)) +#define sigma0(x) (Rot32(x, 7) ^ Rot32(x, 18) ^ ((x) >> 3)) +#define sigma1(x) (Rot32(x, 17) ^ Rot32(x, 19) ^ ((x) >> 10)) + +static const uint32_t SHA256_K[64] = { + 0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5, + 0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5, + 0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3, + 0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174, + 0xe49b69c1, 0xefbe4786, 0x0fc19dc6, 0x240ca1cc, + 0x2de92c6f, 0x4a7484aa, 0x5cb0a9dc, 0x76f988da, + 0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7, + 0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967, + 0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, 0x53380d13, + 0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85, + 0xa2bfe8a1, 0xa81a664b, 0xc24b8b70, 0xc76c51a3, + 0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070, + 0x19a4c116, 0x1e376c08, 0x2748774c, 0x34b0bcb5, + 0x391c0cb3, 0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3, + 0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208, + 0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2 +}; + +static void +SHA256Transform(uint32_t *H, const uint8_t *cp) +{ + uint32_t a, b, c, d, e, f, g, h, t, T1, T2, W[64]; + + for (t = 0; t < 16; t++, cp += 4) + W[t] = (cp[0] << 24) | (cp[1] << 16) | (cp[2] << 8) | cp[3]; + + for (t = 16; t < 64; t++) + W[t] = sigma1(W[t - 2]) + W[t - 7] + + sigma0(W[t - 15]) + W[t - 16]; + + a = H[0]; b = H[1]; c = H[2]; d = H[3]; + e = H[4]; f = H[5]; g = H[6]; h = H[7]; + + for (t = 0; t < 64; t++) { + T1 = h + SIGMA1(e) + Ch(e, f, g) + SHA256_K[t] + W[t]; + T2 = SIGMA0(a) + Maj(a, b, c); + h = g; g = f; f = e; e = d + T1; + d = c; c = b; b = a; a = T1 + T2; + } + + H[0] += a; H[1] += b; H[2] += c; H[3] += d; + H[4] += e; H[5] += f; H[6] += g; H[7] += h; +} + +void +zio_checksum_SHA256(const 
void *buf, uint64_t size, + grub_zfs_endian_t endian, zio_cksum_t *zcp) +{ + uint32_t H[8] = { 0x6a09e667, 0xbb67ae85, 0x3c6ef372, 0xa54ff53a, + 0x510e527f, 0x9b05688c, 0x1f83d9ab, 0x5be0cd19 }; + uint8_t pad[128]; + unsigned padsize = size & 63; + unsigned i; + + for (i = 0; i < size - padsize; i += 64) + SHA256Transform(H, (uint8_t *)buf + i); + + for (i = 0; i < padsize; i++) + pad[i] = ((uint8_t *)buf)[size - padsize + i]; + + for (pad[padsize++] = 0x80; (padsize & 63) != 56; padsize++) + pad[padsize] = 0; + + for (i = 0; i < 8; i++) + pad[padsize++] = (size << 3) >> (56 - 8 * i); + + for (i = 0; i < padsize; i += 64) + SHA256Transform(H, pad + i); + + zcp->zc_word[0] = grub_cpu_to_zfs64((uint64_t)H[0] << 32 | H[1], + endian); + zcp->zc_word[1] = grub_cpu_to_zfs64((uint64_t)H[2] << 32 | H[3], + endian); + zcp->zc_word[2] = grub_cpu_to_zfs64((uint64_t)H[4] << 32 | H[5], + endian); + zcp->zc_word[3] = grub_cpu_to_zfs64((uint64_t)H[6] << 32 | H[7], + endian); +} diff --git a/include/config_cmd_all.h b/include/config_cmd_all.h index 55f4f7a..5933ae9 100644 --- a/include/config_cmd_all.h +++ b/include/config_cmd_all.h @@ -36,6 +36,7 @@ #define CONFIG_CMD_ELF /* ELF (VxWorks) load/boot cmd */ #define CONFIG_CMD_EXT2 /* EXT2 Support */ #define CONFIG_CMD_FAT /* FAT support */ +#define CONFIG_CMD_ZFS /* ZFS support */ #define CONFIG_CMD_FDC /* Floppy Disk Support */ #define CONFIG_CMD_FDOS /* Floppy DOS support */ #define CONFIG_CMD_FLASH /* flinfo, erase, protect */ diff --git a/include/zfs/dmu.h b/include/zfs/dmu.h new file mode 100644 index 0000000..bee317e --- /dev/null +++ b/include/zfs/dmu.h @@ -0,0 +1,119 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. 
+ * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2010 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. + */ + +#ifndef _SYS_DMU_H +#define _SYS_DMU_H + +/* + * This file describes the interface that the DMU provides for its + * consumers. + * + * The DMU also interacts with the SPA. That interface is described in + * dmu_spa.h. + */ +typedef enum dmu_object_type { + DMU_OT_NONE, + /* general: */ + DMU_OT_OBJECT_DIRECTORY, /* ZAP */ + DMU_OT_OBJECT_ARRAY, /* UINT64 */ + DMU_OT_PACKED_NVLIST, /* UINT8 (XDR by nvlist_pack/unpack) */ + DMU_OT_PACKED_NVLIST_SIZE, /* UINT64 */ + DMU_OT_BPLIST, /* UINT64 */ + DMU_OT_BPLIST_HDR, /* UINT64 */ + /* spa: */ + DMU_OT_SPACE_MAP_HEADER, /* UINT64 */ + DMU_OT_SPACE_MAP, /* UINT64 */ + /* zil: */ + DMU_OT_INTENT_LOG, /* UINT64 */ + /* dmu: */ + DMU_OT_DNODE, /* DNODE */ + DMU_OT_OBJSET, /* OBJSET */ + /* dsl: */ + DMU_OT_DSL_DIR, /* UINT64 */ + DMU_OT_DSL_DIR_CHILD_MAP, /* ZAP */ + DMU_OT_DSL_DS_SNAP_MAP, /* ZAP */ + DMU_OT_DSL_PROPS, /* ZAP */ + DMU_OT_DSL_DATASET, /* UINT64 */ + /* zpl: */ + DMU_OT_ZNODE, /* ZNODE */ + DMU_OT_OLDACL, /* OLD ACL */ + DMU_OT_PLAIN_FILE_CONTENTS, /* UINT8 */ + DMU_OT_DIRECTORY_CONTENTS, /* ZAP */ + DMU_OT_MASTER_NODE, /* ZAP */ + DMU_OT_UNLINKED_SET, /* ZAP */ + /* zvol: */ + DMU_OT_ZVOL, /* UINT8 */ + DMU_OT_ZVOL_PROP, /* ZAP */ + /* other; for testing only! 
*/ + DMU_OT_PLAIN_OTHER, /* UINT8 */ + DMU_OT_UINT64_OTHER, /* UINT64 */ + DMU_OT_ZAP_OTHER, /* ZAP */ + /* new object types: */ + DMU_OT_ERROR_LOG, /* ZAP */ + DMU_OT_SPA_HISTORY, /* UINT8 */ + DMU_OT_SPA_HISTORY_OFFSETS, /* spa_his_phys_t */ + DMU_OT_POOL_PROPS, /* ZAP */ + DMU_OT_DSL_PERMS, /* ZAP */ + DMU_OT_ACL, /* ACL */ + DMU_OT_SYSACL, /* SYSACL */ + DMU_OT_FUID, /* FUID table (Packed NVLIST UINT8) */ + DMU_OT_FUID_SIZE, /* FUID table size UINT64 */ + DMU_OT_NEXT_CLONES, /* ZAP */ + DMU_OT_SCRUB_QUEUE, /* ZAP */ + DMU_OT_USERGROUP_USED, /* ZAP */ + DMU_OT_USERGROUP_QUOTA, /* ZAP */ + DMU_OT_USERREFS, /* ZAP */ + DMU_OT_DDT_ZAP, /* ZAP */ + DMU_OT_DDT_STATS, /* ZAP */ + DMU_OT_SA, /* System attr */ + DMU_OT_SA_MASTER_NODE, /* ZAP */ + DMU_OT_SA_ATTR_REGISTRATION, /* ZAP */ + DMU_OT_SA_ATTR_LAYOUTS, /* ZAP */ + DMU_OT_NUMTYPES +} dmu_object_type_t; + +typedef enum dmu_objset_type { + DMU_OST_NONE, + DMU_OST_META, + DMU_OST_ZFS, + DMU_OST_ZVOL, + DMU_OST_OTHER, /* For testing only! */ + DMU_OST_ANY, /* Be careful! */ + DMU_OST_NUMTYPES +} dmu_objset_type_t; + +/* + * The names of zap entries in the DIRECTORY_OBJECT of the MOS. + */ +#define DMU_POOL_DIRECTORY_OBJECT 1 +#define DMU_POOL_CONFIG "config" +#define DMU_POOL_ROOT_DATASET "root_dataset" +#define DMU_POOL_SYNC_BPLIST "sync_bplist" +#define DMU_POOL_ERRLOG_SCRUB "errlog_scrub" +#define DMU_POOL_ERRLOG_LAST "errlog_last" +#define DMU_POOL_SPARES "spares" +#define DMU_POOL_DEFLATE "deflate" +#define DMU_POOL_HISTORY "history" +#define DMU_POOL_PROPS "pool_props" +#define DMU_POOL_L2CACHE "l2cache" + +#endif /* _SYS_DMU_H */ diff --git a/include/zfs/dmu_objset.h b/include/zfs/dmu_objset.h new file mode 100644 index 0000000..176cad7 --- /dev/null +++ b/include/zfs/dmu_objset.h @@ -0,0 +1,43 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. 
+ * Copyright (C) 2010 Robert Millan rmh@gnu.org + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2009 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. + */ + +#ifndef _SYS_DMU_OBJSET_H +#define _SYS_DMU_OBJSET_H + +#include <zfs/zil.h> + +#define OBJSET_PHYS_SIZE 2048 +#define OBJSET_PHYS_SIZE_V14 1024 + +typedef struct objset_phys { + dnode_phys_t os_meta_dnode; + zil_header_t os_zil_header; + uint64_t os_type; + uint64_t os_flags; + char os_pad[OBJSET_PHYS_SIZE - sizeof(dnode_phys_t)*3 - + sizeof(zil_header_t) - sizeof(uint64_t)*2]; + dnode_phys_t os_userused_dnode; + dnode_phys_t os_groupused_dnode; +} objset_phys_t; + +#endif /* _SYS_DMU_OBJSET_H */ diff --git a/include/zfs/dnode.h b/include/zfs/dnode.h new file mode 100644 index 0000000..9ec3d43 --- /dev/null +++ b/include/zfs/dnode.h @@ -0,0 +1,80 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. 
+ * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2010 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. + */ + +#ifndef _SYS_DNODE_H +#define _SYS_DNODE_H + +#include <zfs/spa.h> + +/* + * Fixed constants. + */ +#define DNODE_SHIFT 9 /* 512 bytes */ +#define DN_MIN_INDBLKSHIFT 10 /* 1k */ +#define DN_MAX_INDBLKSHIFT 14 /* 16k */ +#define DNODE_BLOCK_SHIFT 14 /* 16k */ +#define DNODE_CORE_SIZE 64 /* 64 bytes for dnode sans blkptrs */ +#define DN_MAX_OBJECT_SHIFT 48 /* 256 trillion (zfs_fid_t limit) */ +#define DN_MAX_OFFSET_SHIFT 64 /* 2^64 bytes in a dnode */ + +/* + * Derived constants. + */ +#define DNODE_SIZE (1 << DNODE_SHIFT) +#define DN_MAX_NBLKPTR ((DNODE_SIZE - DNODE_CORE_SIZE) >> SPA_BLKPTRSHIFT) +#define DN_MAX_BONUSLEN (DNODE_SIZE - DNODE_CORE_SIZE - (1 << SPA_BLKPTRSHIFT)) +#define DN_MAX_OBJECT (1ULL << DN_MAX_OBJECT_SHIFT) + +#define DNODES_PER_BLOCK_SHIFT (DNODE_BLOCK_SHIFT - DNODE_SHIFT) +#define DNODES_PER_BLOCK (1ULL << DNODES_PER_BLOCK_SHIFT) +#define DNODES_PER_LEVEL_SHIFT (DN_MAX_INDBLKSHIFT - SPA_BLKPTRSHIFT) + +#define DNODE_FLAG_SPILL_BLKPTR (1<<2) + +#define DN_BONUS(dnp) ((void *)((dnp)->dn_bonus + \ + (((dnp)->dn_nblkptr - 1) * sizeof(blkptr_t)))) + +typedef struct dnode_phys { + uint8_t dn_type; /* dmu_object_type_t */ + uint8_t dn_indblkshift; /* ln2(indirect block size) */ + uint8_t dn_nlevels; /* 1=dn_blkptr->data blocks */ + uint8_t dn_nblkptr; /* length of dn_blkptr */ + uint8_t dn_bonustype; /* type of data in bonus buffer */ + uint8_t dn_checksum; /* ZIO_CHECKSUM type */ + uint8_t dn_compress; /* ZIO_COMPRESS type */ + uint8_t dn_flags; /* 
DNODE_FLAG_* */ + uint16_t dn_datablkszsec; /* data block size in 512b sectors */ + uint16_t dn_bonuslen; /* length of dn_bonus */ + uint8_t dn_pad2[4]; + + /* accounting is protected by dn_dirty_mtx */ + uint64_t dn_maxblkid; /* largest allocated block ID */ + uint64_t dn_used; /* bytes (or sectors) of disk space */ + + uint64_t dn_pad3[4]; + + blkptr_t dn_blkptr[1]; + uint8_t dn_bonus[DN_MAX_BONUSLEN - sizeof(blkptr_t)]; + blkptr_t dn_spill; +} dnode_phys_t; + +#endif /* _SYS_DNODE_H */ diff --git a/include/zfs/dsl_dataset.h b/include/zfs/dsl_dataset.h new file mode 100644 index 0000000..c6de7ab --- /dev/null +++ b/include/zfs/dsl_dataset.h @@ -0,0 +1,52 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2007 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. 
+ */ + +#ifndef _SYS_DSL_DATASET_H +#define _SYS_DSL_DATASET_H + +typedef struct dsl_dataset_phys { + uint64_t ds_dir_obj; + uint64_t ds_prev_snap_obj; + uint64_t ds_prev_snap_txg; + uint64_t ds_next_snap_obj; + uint64_t ds_snapnames_zapobj; /* zap obj of snaps; ==0 for snaps */ + uint64_t ds_num_children; /* clone/snap children; ==0 for head */ + uint64_t ds_creation_time; /* seconds since 1970 */ + uint64_t ds_creation_txg; + uint64_t ds_deadlist_obj; + uint64_t ds_used_bytes; + uint64_t ds_compressed_bytes; + uint64_t ds_uncompressed_bytes; + uint64_t ds_unique_bytes; /* only relevant to snapshots */ + /* + * The ds_fsid_guid is a 56-bit ID that can change to avoid + * collisions. The ds_guid is a 64-bit ID that will never + * change, so there is a small probability that it will collide. + */ + uint64_t ds_fsid_guid; + uint64_t ds_guid; + uint64_t ds_flags; + blkptr_t ds_bp; + uint64_t ds_pad[8]; /* pad out to 320 bytes for good measure */ +} dsl_dataset_phys_t; + +#endif /* _SYS_DSL_DATASET_H */ diff --git a/include/zfs/dsl_dir.h b/include/zfs/dsl_dir.h new file mode 100644 index 0000000..c04e0b6 --- /dev/null +++ b/include/zfs/dsl_dir.h @@ -0,0 +1,48 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2007 Sun Microsystems, Inc. 
All rights reserved. + * Use is subject to license terms. + */ + +#ifndef _SYS_DSL_DIR_H +#define _SYS_DSL_DIR_H + +typedef struct dsl_dir_phys { + uint64_t dd_creation_time; /* not actually used */ + uint64_t dd_head_dataset_obj; + uint64_t dd_parent_obj; + uint64_t dd_clone_parent_obj; + uint64_t dd_child_dir_zapobj; + /* + * how much space our children are accounting for; for leaf + * datasets, == physical space used by fs + snaps + */ + uint64_t dd_used_bytes; + uint64_t dd_compressed_bytes; + uint64_t dd_uncompressed_bytes; + /* Administrative quota setting */ + uint64_t dd_quota; + /* Administrative reservation setting */ + uint64_t dd_reserved; + uint64_t dd_props_zapobj; + uint64_t dd_deleg_zapobj; /* dataset permissions */ + uint64_t dd_pad[20]; /* pad out to 256 bytes for good measure */ +} dsl_dir_phys_t; + +#endif /* _SYS_DSL_DIR_H */ diff --git a/include/zfs/sa_impl.h b/include/zfs/sa_impl.h new file mode 100644 index 0000000..4ec49fe --- /dev/null +++ b/include/zfs/sa_impl.h @@ -0,0 +1,34 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2010 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. 
+ */ +#ifndef _SYS_SA_IMPL_H +#define _SYS_SA_IMPL_H + +typedef struct sa_hdr_phys { + uint32_t sa_magic; + uint16_t sa_layout_info; + uint16_t sa_lengths[1]; +} sa_hdr_phys_t; + +#define SA_HDR_SIZE(hdr) BF32_GET_SB(hdr->sa_layout_info, 10, 16, 3, 0) +#define SA_SIZE_OFFSET 0x8 + +#endif /* _SYS_SA_IMPL_H */ diff --git a/include/zfs/spa.h b/include/zfs/spa.h new file mode 100644 index 0000000..100e2a6 --- /dev/null +++ b/include/zfs/spa.h @@ -0,0 +1,311 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004,2009 + * Free Software Foundation, Inc. + * Copyright 2010 Sun Microsystems, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ + +#ifndef GRUB_ZFS_SPA_HEADER +#define GRUB_ZFS_SPA_HEADER 1 + +typedef enum grub_zfs_endian { + UNKNOWN_ENDIAN = -2, + LITTLE_ENDIAN = -1, + BIG_ENDIAN = 0 +} grub_zfs_endian_t; + + +#define grub_zfs_to_cpu16(x, a) (((a) == BIG_ENDIAN) ? grub_be_to_cpu16(x) \ + : grub_le_to_cpu16(x)) +#define grub_cpu_to_zfs16(x, a) (((a) == BIG_ENDIAN) ? grub_cpu_to_be16(x) \ + : grub_cpu_to_le16(x)) + +#define grub_zfs_to_cpu32(x, a) (((a) == BIG_ENDIAN) ? grub_be_to_cpu32(x) \ + : grub_le_to_cpu32(x)) +#define grub_cpu_to_zfs32(x, a) (((a) == BIG_ENDIAN) ? grub_cpu_to_be32(x) \ + : grub_cpu_to_le32(x)) + +#define grub_zfs_to_cpu64(x, a) (((a) == BIG_ENDIAN) ? 
grub_be_to_cpu64(x) \ + : grub_le_to_cpu64(x)) +#define grub_cpu_to_zfs64(x, a) (((a) == BIG_ENDIAN) ? grub_cpu_to_be64(x) \ + : grub_cpu_to_le64(x)) + +/* + * General-purpose 32-bit and 64-bit bitfield encodings. + */ +#define BF32_DECODE(x, low, len) P2PHASE((x) >> (low), 1U << (len)) +#define BF64_DECODE(x, low, len) P2PHASE((x) >> (low), 1ULL << (len)) +#define BF32_ENCODE(x, low, len) (P2PHASE((x), 1U << (len)) << (low)) +#define BF64_ENCODE(x, low, len) (P2PHASE((x), 1ULL << (len)) << (low)) + +#define BF32_GET(x, low, len) BF32_DECODE(x, low, len) +#define BF64_GET(x, low, len) BF64_DECODE(x, low, len) + +#define BF32_SET(x, low, len, val) \ + ((x) ^= BF32_ENCODE((x >> low) ^ (val), low, len)) +#define BF64_SET(x, low, len, val) \ + ((x) ^= BF64_ENCODE((x >> low) ^ (val), low, len)) + +#define BF32_GET_SB(x, low, len, shift, bias) \ + ((BF32_GET(x, low, len) + (bias)) << (shift)) +#define BF64_GET_SB(x, low, len, shift, bias) \ + ((BF64_GET(x, low, len) + (bias)) << (shift)) + +#define BF32_SET_SB(x, low, len, shift, bias, val) \ + BF32_SET(x, low, len, ((val) >> (shift)) - (bias)) +#define BF64_SET_SB(x, low, len, shift, bias, val) \ + BF64_SET(x, low, len, ((val) >> (shift)) - (bias)) + +/* + * We currently support nine block sizes, from 512 bytes to 128K. + * We could go higher, but the benefits are near-zero and the cost + * of COWing a giant block to modify one byte would become excessive. + */ +#define SPA_MINBLOCKSHIFT 9 +#define SPA_MAXBLOCKSHIFT 17 +#define SPA_MINBLOCKSIZE (1ULL << SPA_MINBLOCKSHIFT) +#define SPA_MAXBLOCKSIZE (1ULL << SPA_MAXBLOCKSHIFT) + +#define SPA_BLOCKSIZES (SPA_MAXBLOCKSHIFT - SPA_MINBLOCKSHIFT + 1) + +/* + * Size of block to hold the configuration data (a packed nvlist) + */ +#define SPA_CONFIG_BLOCKSIZE (1 << 14) + +/* + * The DVA size encodings for LSIZE and PSIZE support blocks up to 32MB. 
+ * The ASIZE encoding should be at least 64 times larger (6 more bits) + * to support up to 4-way RAID-Z mirror mode with worst-case gang block + * overhead, three DVAs per bp, plus one more bit in case we do anything + * else that expands the ASIZE. + */ +#define SPA_LSIZEBITS 16 /* LSIZE up to 32M (2^16 * 512) */ +#define SPA_PSIZEBITS 16 /* PSIZE up to 32M (2^16 * 512) */ +#define SPA_ASIZEBITS 24 /* ASIZE up to 64 times larger */ + +/* + * All SPA data is represented by 128-bit data virtual addresses (DVAs). + * The members of the dva_t should be considered opaque outside the SPA. + */ +typedef struct dva { + uint64_t dva_word[2]; +} dva_t; + +/* + * Each block has a 256-bit checksum -- strong enough for cryptographic hashes. + */ +typedef struct zio_cksum { + uint64_t zc_word[4]; +} zio_cksum_t; + +/* + * Each block is described by its DVAs, time of birth, checksum, etc. + * The word-by-word, bit-by-bit layout of the blkptr is as follows: + * + * 64 56 48 40 32 24 16 8 0 + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * 0 | vdev1 | GRID | ASIZE | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * 1 |G| offset1 | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * 2 | vdev2 | GRID | ASIZE | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * 3 |G| offset2 | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * 4 | vdev3 | GRID | ASIZE | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * 5 |G| offset3 | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * 6 |BDX|lvl| type | cksum | comp | PSIZE | LSIZE | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * 7 | padding | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * 8 | padding | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * 9 | physical birth txg | + * 
+-------+-------+-------+-------+-------+-------+-------+-------+ + * a | logical birth txg | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * b | fill count | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * c | checksum[0] | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * d | checksum[1] | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * e | checksum[2] | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * f | checksum[3] | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * + * Legend: + * + * vdev virtual device ID + * offset offset into virtual device + * LSIZE logical size + * PSIZE physical size (after compression) + * ASIZE allocated size (including RAID-Z parity and gang block headers) + * GRID RAID-Z layout information (reserved for future use) + * cksum checksum function + * comp compression function + * G gang block indicator + * B byteorder (endianness) + * D dedup + * X unused + * lvl level of indirection + * type DMU object type + * phys birth txg of block allocation; zero if same as logical birth txg + * log. birth transaction group in which the block was logically born + * fill count number of non-zero blocks under this bp + * checksum[4] 256-bit checksum of the data this bp describes + */ +#define SPA_BLKPTRSHIFT 7 /* blkptr_t is 128 bytes */ +#define SPA_DVAS_PER_BP 3 /* Number of DVAs in a bp */ + +typedef struct blkptr { + dva_t blk_dva[SPA_DVAS_PER_BP]; /* Data Virtual Addresses */ + uint64_t blk_prop; /* size, compression, type, etc */ + uint64_t blk_pad[2]; /* Extra space for the future */ + uint64_t blk_phys_birth; /* txg when block was allocated */ + uint64_t blk_birth; /* transaction group at birth */ + uint64_t blk_fill; /* fill count */ + zio_cksum_t blk_cksum; /* 256-bit checksum */ +} blkptr_t; + +/* + * Macros to get and set fields in a bp or DVA. 
+ */ +#define DVA_GET_ASIZE(dva) \ + BF64_GET_SB((dva)->dva_word[0], 0, 24, SPA_MINBLOCKSHIFT, 0) +#define DVA_SET_ASIZE(dva, x) \ + BF64_SET_SB((dva)->dva_word[0], 0, 24, SPA_MINBLOCKSHIFT, 0, x) + +#define DVA_GET_GRID(dva) BF64_GET((dva)->dva_word[0], 24, 8) +#define DVA_SET_GRID(dva, x) BF64_SET((dva)->dva_word[0], 24, 8, x) + +#define DVA_GET_VDEV(dva) BF64_GET((dva)->dva_word[0], 32, 32) +#define DVA_SET_VDEV(dva, x) BF64_SET((dva)->dva_word[0], 32, 32, x) + +#define DVA_GET_GANG(dva) BF64_GET((dva)->dva_word[1], 63, 1) +#define DVA_SET_GANG(dva, x) BF64_SET((dva)->dva_word[1], 63, 1, x) + +#define BP_GET_LSIZE(bp) \ + BF64_GET_SB((bp)->blk_prop, 0, 16, SPA_MINBLOCKSHIFT, 1) +#define BP_SET_LSIZE(bp, x) \ + BF64_SET_SB((bp)->blk_prop, 0, 16, SPA_MINBLOCKSHIFT, 1, x) + +#define BP_GET_COMPRESS(bp) BF64_GET((bp)->blk_prop, 32, 8) +#define BP_SET_COMPRESS(bp, x) BF64_SET((bp)->blk_prop, 32, 8, x) + +#define BP_GET_CHECKSUM(bp) BF64_GET((bp)->blk_prop, 40, 8) +#define BP_SET_CHECKSUM(bp, x) BF64_SET((bp)->blk_prop, 40, 8, x) + +#define BP_GET_TYPE(bp) BF64_GET((bp)->blk_prop, 48, 8) +#define BP_SET_TYPE(bp, x) BF64_SET((bp)->blk_prop, 48, 8, x) + +#define BP_GET_LEVEL(bp) BF64_GET((bp)->blk_prop, 56, 5) +#define BP_SET_LEVEL(bp, x) BF64_SET((bp)->blk_prop, 56, 5, x) + +#define BP_GET_PROP_BIT_61(bp) BF64_GET((bp)->blk_prop, 61, 1) +#define BP_SET_PROP_BIT_61(bp, x) BF64_SET((bp)->blk_prop, 61, 1, x) + +#define BP_GET_DEDUP(bp) BF64_GET((bp)->blk_prop, 62, 1) +#define BP_SET_DEDUP(bp, x) BF64_SET((bp)->blk_prop, 62, 1, x) + +#define BP_GET_BYTEORDER(bp) (0 - BF64_GET((bp)->blk_prop, 63, 1)) +#define BP_SET_BYTEORDER(bp, x) BF64_SET((bp)->blk_prop, 63, 1, x) + +#define BP_PHYSICAL_BIRTH(bp) \ + ((bp)->blk_phys_birth ? (bp)->blk_phys_birth : (bp)->blk_birth) + +#define BP_SET_BIRTH(bp, logical, physical) \ + { \ + (bp)->blk_birth = (logical); \ + (bp)->blk_phys_birth = ((logical) == (physical) ? 
0 : (physical)); \ + } + +#define BP_GET_ASIZE(bp) \ + (DVA_GET_ASIZE(&(bp)->blk_dva[0]) + DVA_GET_ASIZE(&(bp)->blk_dva[1]) + \ + DVA_GET_ASIZE(&(bp)->blk_dva[2])) + +#define BP_GET_UCSIZE(bp) \ + ((BP_GET_LEVEL(bp) > 0 || dmu_ot[BP_GET_TYPE(bp)].ot_metadata) ? \ + BP_GET_PSIZE(bp) : BP_GET_LSIZE(bp)) + +#define BP_GET_NDVAS(bp) \ + (!!DVA_GET_ASIZE(&(bp)->blk_dva[0]) + \ + !!DVA_GET_ASIZE(&(bp)->blk_dva[1]) + \ + !!DVA_GET_ASIZE(&(bp)->blk_dva[2])) + +#define BP_COUNT_GANG(bp) \ + (DVA_GET_GANG(&(bp)->blk_dva[0]) + \ + DVA_GET_GANG(&(bp)->blk_dva[1]) + \ + DVA_GET_GANG(&(bp)->blk_dva[2])) + +#define DVA_EQUAL(dva1, dva2) \ + ((dva1)->dva_word[1] == (dva2)->dva_word[1] && \ + (dva1)->dva_word[0] == (dva2)->dva_word[0]) + +#define BP_EQUAL(bp1, bp2) \ + (BP_PHYSICAL_BIRTH(bp1) == BP_PHYSICAL_BIRTH(bp2) && \ + DVA_EQUAL(&(bp1)->blk_dva[0], &(bp2)->blk_dva[0]) && \ + DVA_EQUAL(&(bp1)->blk_dva[1], &(bp2)->blk_dva[1]) && \ + DVA_EQUAL(&(bp1)->blk_dva[2], &(bp2)->blk_dva[2])) + +#define ZIO_CHECKSUM_EQUAL(zc1, zc2) \ + (0 == (((zc1).zc_word[0] - (zc2).zc_word[0]) | \ + ((zc1).zc_word[1] - (zc2).zc_word[1]) | \ + ((zc1).zc_word[2] - (zc2).zc_word[2]) | \ + ((zc1).zc_word[3] - (zc2).zc_word[3]))) + +#define DVA_IS_VALID(dva) (DVA_GET_ASIZE(dva) != 0) + +#define ZIO_SET_CHECKSUM(zcp, w0, w1, w2, w3) \ + { \ + (zcp)->zc_word[0] = w0; \ + (zcp)->zc_word[1] = w1; \ + (zcp)->zc_word[2] = w2; \ + (zcp)->zc_word[3] = w3; \ + } + +#define BP_IDENTITY(bp) (&(bp)->blk_dva[0]) +#define BP_IS_GANG(bp) DVA_GET_GANG(BP_IDENTITY(bp)) +#define BP_IS_HOLE(bp) ((bp)->blk_birth == 0) + +/* BP_IS_RAIDZ(bp) assumes no block compression */ +#define BP_IS_RAIDZ(bp) (DVA_GET_ASIZE(&(bp)->blk_dva[0]) > \ + BP_GET_PSIZE(bp)) + +#define BP_ZERO(bp) \ + { \ + (bp)->blk_dva[0].dva_word[0] = 0; \ + (bp)->blk_dva[0].dva_word[1] = 0; \ + (bp)->blk_dva[1].dva_word[0] = 0; \ + (bp)->blk_dva[1].dva_word[1] = 0; \ + (bp)->blk_dva[2].dva_word[0] = 0; \ + (bp)->blk_dva[2].dva_word[1] = 0; \ + (bp)->blk_prop = 
0; \ + (bp)->blk_pad[0] = 0; \ + (bp)->blk_pad[1] = 0; \ + (bp)->blk_phys_birth = 0; \ + (bp)->blk_birth = 0; \ + (bp)->blk_fill = 0; \ + ZIO_SET_CHECKSUM(&(bp)->blk_cksum, 0, 0, 0, 0); \ + } + +#define BP_SPRINTF_LEN 320 + +#endif /* ! GRUB_ZFS_SPA_HEADER */ diff --git a/include/zfs/uberblock_impl.h b/include/zfs/uberblock_impl.h new file mode 100644 index 0000000..12daf98 --- /dev/null +++ b/include/zfs/uberblock_impl.h @@ -0,0 +1,57 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 + * Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2010 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. + */ + +#ifndef _SYS_UBERBLOCK_IMPL_H +#define _SYS_UBERBLOCK_IMPL_H + +#define UBMAX(a, b) ((a) > (b) ? (a) : (b)) + +/* + * The uberblock version is incremented whenever an incompatible on-disk + * format change is made to the SPA, DMU, or ZAP. + * + * Note: the first two fields should never be moved. When a storage pool + * is opened, the uberblock must be read off the disk before the version + * can be checked. If the ub_version field is moved, we may not detect + * version mismatch. If the ub_magic field is moved, applications that + * expect the magic number in the first word won't work. + */ +#define UBERBLOCK_MAGIC 0x00bab10c /* oo-ba-bloc! 
*/ +#define UBERBLOCK_SHIFT 10 /* up to 1K */ + +typedef struct uberblock { + uint64_t ub_magic; /* UBERBLOCK_MAGIC */ + uint64_t ub_version; /* ZFS_VERSION */ + uint64_t ub_txg; /* txg of last sync */ + uint64_t ub_guid_sum; /* sum of all vdev guids */ + uint64_t ub_timestamp; /* UTC time of last sync */ + blkptr_t ub_rootbp; /* MOS objset_phys_t */ +} uberblock_t; + +#define VDEV_UBERBLOCK_SHIFT(as) UBMAX(as, UBERBLOCK_SHIFT) +#define UBERBLOCK_SIZE(as) (1ULL << VDEV_UBERBLOCK_SHIFT(as)) + +/* Number of uberblocks that can fit in the ring at a given ashift */ +#define UBERBLOCK_COUNT(as) (VDEV_UBERBLOCK_RING >> VDEV_UBERBLOCK_SHIFT(as)) + +#endif /* _SYS_UBERBLOCK_IMPL_H */ diff --git a/include/zfs/vdev_impl.h b/include/zfs/vdev_impl.h new file mode 100644 index 0000000..97033c9 --- /dev/null +++ b/include/zfs/vdev_impl.h @@ -0,0 +1,69 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2010 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. 
+ */ + +#ifndef _SYS_VDEV_IMPL_H +#define _SYS_VDEV_IMPL_H + +#define VDEV_SKIP_SIZE (8 << 10) +#define VDEV_BOOT_HEADER_SIZE (8 << 10) +#define VDEV_PHYS_SIZE (112 << 10) +#define VDEV_UBERBLOCK_RING (128 << 10) + +/* ZFS boot block */ +#define VDEV_BOOT_MAGIC 0x2f5b007b10cULL +#define VDEV_BOOT_VERSION 1 /* version number */ + +typedef struct vdev_boot_header { + uint64_t vb_magic; /* VDEV_BOOT_MAGIC */ + uint64_t vb_version; /* VDEV_BOOT_VERSION */ + uint64_t vb_offset; /* start offset (bytes) */ + uint64_t vb_size; /* size (bytes) */ + char vb_pad[VDEV_BOOT_HEADER_SIZE - 4 * sizeof(uint64_t)]; +} vdev_boot_header_t; + +typedef struct vdev_phys { + char vp_nvlist[VDEV_PHYS_SIZE - sizeof(zio_eck_t)]; + zio_eck_t vp_zbt; +} vdev_phys_t; + +typedef struct vdev_label { + char vl_pad[VDEV_SKIP_SIZE]; /* 8K */ + vdev_boot_header_t vl_boot_header; /* 8K */ + vdev_phys_t vl_vdev_phys; /* 112K */ + char vl_uberblock[VDEV_UBERBLOCK_RING]; /* 128K */ +} vdev_label_t; /* 256K total */ + +/* + * Size and offset of embedded boot loader region on each label. + * The total size of the first two labels plus the boot area is 4MB. + */ +#define VDEV_BOOT_OFFSET (2 * sizeof(vdev_label_t)) +#define VDEV_BOOT_SIZE (7ULL << 19) /* 3.5M */ + +/* + * Size of label regions at the start and end of each leaf device. + */ +#define VDEV_LABEL_START_SIZE (2 * sizeof(vdev_label_t) + VDEV_BOOT_SIZE) +#define VDEV_LABEL_END_SIZE (2 * sizeof(vdev_label_t)) +#define VDEV_LABELS 4 + +#endif /* _SYS_VDEV_IMPL_H */ diff --git a/include/zfs/zap_impl.h b/include/zfs/zap_impl.h new file mode 100644 index 0000000..65e9311 --- /dev/null +++ b/include/zfs/zap_impl.h @@ -0,0 +1,112 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 + * Free Software Foundation, Inc. 
+ * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2009 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. + */ + +#ifndef _SYS_ZAP_IMPL_H +#define _SYS_ZAP_IMPL_H + +#define ZAP_MAGIC 0x2F52AB2ABULL + +#define ZAP_HASHBITS 28 +#define MZAP_ENT_LEN 64 +#define MZAP_NAME_LEN (MZAP_ENT_LEN - 8 - 4 - 2) +#define MZAP_MAX_BLKSHIFT SPA_MAXBLOCKSHIFT +#define MZAP_MAX_BLKSZ (1 << MZAP_MAX_BLKSHIFT) + +typedef struct mzap_ent_phys { + uint64_t mze_value; + uint32_t mze_cd; + uint16_t mze_pad; /* in case we want to chain them someday */ + char mze_name[MZAP_NAME_LEN]; +} mzap_ent_phys_t; + +typedef struct mzap_phys { + uint64_t mz_block_type; /* ZBT_MICRO */ + uint64_t mz_salt; + uint64_t mz_pad[6]; + mzap_ent_phys_t mz_chunk[1]; + /* actually variable size depending on block size */ +} mzap_phys_t; + +/* + * The (fat) zap is stored in one object. It is an array of + * 1<<FZAP_BLOCK_SHIFT byte blocks. The layout looks like one of: + * + * ptrtbl fits in first block: + * [zap_phys_t zap_ptrtbl_shift < 6] [zap_leaf_t] ... + * + * ptrtbl too big for first block: + * [zap_phys_t zap_ptrtbl_shift >= 6] [zap_leaf_t] [ptrtbl] ... 
+ * + */ + +#define ZBT_LEAF ((1ULL << 63) + 0) +#define ZBT_HEADER ((1ULL << 63) + 1) +#define ZBT_MICRO ((1ULL << 63) + 3) +/* any other values are ptrtbl blocks */ + +/* + * the embedded pointer table takes up half a block: + * block size / entry size (2^3) / 2 + */ +#define ZAP_EMBEDDED_PTRTBL_SHIFT(zap) (FZAP_BLOCK_SHIFT(zap) - 3 - 1) + +/* + * The embedded pointer table starts half-way through the block. Since + * the pointer table itself is half the block, it starts at (64-bit) + * word number (1<<ZAP_EMBEDDED_PTRTBL_SHIFT(zap)). + */ +#define ZAP_EMBEDDED_PTRTBL_ENT(zap, idx) \ + ((uint64_t *)(zap)->zap_f.zap_phys) \ + [(idx) + (1<<ZAP_EMBEDDED_PTRTBL_SHIFT(zap))] + +/* + * TAKE NOTE: + * If zap_phys_t is modified, zap_byteswap() must be modified. + */ +typedef struct zap_phys { + uint64_t zap_block_type; /* ZBT_HEADER */ + uint64_t zap_magic; /* ZAP_MAGIC */ + + struct zap_table_phys { + uint64_t zt_blk; /* starting block number */ + uint64_t zt_numblks; /* number of blocks */ + uint64_t zt_shift; /* bits to index it */ + uint64_t zt_nextblk; /* next (larger) copy start block */ + uint64_t zt_blks_copied; /* number source blocks copied */ + } zap_ptrtbl; + + uint64_t zap_freeblk; /* the next free block */ + uint64_t zap_num_leafs; /* number of leafs */ + uint64_t zap_num_entries; /* number of entries */ + uint64_t zap_salt; /* salt to stir into hash function */ + uint64_t zap_normflags; /* flags for u8_textprep_str() */ + uint64_t zap_flags; /* zap_flag_t */ + /* + * This structure is followed by padding, and then the embedded + * pointer table. The embedded pointer table takes up second + * half of the block. It is accessed using the + * ZAP_EMBEDDED_PTRTBL_ENT() macro. 
+ */ +} zap_phys_t; + +#endif /* _SYS_ZAP_IMPL_H */ diff --git a/include/zfs/zap_leaf.h b/include/zfs/zap_leaf.h new file mode 100644 index 0000000..4ddddb5 --- /dev/null +++ b/include/zfs/zap_leaf.h @@ -0,0 +1,103 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 + * Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2007 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. + */ + +#ifndef _SYS_ZAP_LEAF_H +#define _SYS_ZAP_LEAF_H + +#define ZAP_LEAF_MAGIC 0x2AB1EAF + +/* chunk size = 24 bytes */ +#define ZAP_LEAF_CHUNKSIZE 24 + +/* + * The amount of space within the chunk available for the array is: + * chunk size - space for type (1) - space for next pointer (2) + */ +#define ZAP_LEAF_ARRAY_BYTES (ZAP_LEAF_CHUNKSIZE - 3) + +typedef enum zap_chunk_type { + ZAP_CHUNK_FREE = 253, + ZAP_CHUNK_ENTRY = 252, + ZAP_CHUNK_ARRAY = 251, + ZAP_CHUNK_TYPE_MAX = 250 +} zap_chunk_type_t; + +/* + * TAKE NOTE: + * If zap_leaf_phys_t is modified, zap_leaf_byteswap() must be modified. 
+ */ +typedef struct zap_leaf_phys { + struct zap_leaf_header { + uint64_t lh_block_type; /* ZBT_LEAF */ + uint64_t lh_pad1; + uint64_t lh_prefix; /* hash prefix of this leaf */ + uint32_t lh_magic; /* ZAP_LEAF_MAGIC */ + uint16_t lh_nfree; /* number free chunks */ + uint16_t lh_nentries; /* number of entries */ + uint16_t lh_prefix_len; /* num bits used to id this */ + + /* above is accessible to zap, below is zap_leaf private */ + + uint16_t lh_freelist; /* chunk head of free list */ + uint8_t lh_pad2[12]; + } l_hdr; /* 2 24-byte chunks */ + + /* + * The header is followed by a hash table with + * ZAP_LEAF_HASH_NUMENTRIES(zap) entries. The hash table is + * followed by an array of ZAP_LEAF_NUMCHUNKS(zap) + * zap_leaf_chunk structures. These structures are accessed + * with the ZAP_LEAF_CHUNK() macro. + */ + + uint16_t l_hash[1]; +} zap_leaf_phys_t; + +typedef union zap_leaf_chunk { + struct zap_leaf_entry { + uint8_t le_type; /* always ZAP_CHUNK_ENTRY */ + uint8_t le_int_size; /* size of ints */ + uint16_t le_next; /* next entry in hash chain */ + uint16_t le_name_chunk; /* first chunk of the name */ + uint16_t le_name_length; /* bytes in name, incl null */ + uint16_t le_value_chunk; /* first chunk of the value */ + uint16_t le_value_length; /* value length in ints */ + uint32_t le_cd; /* collision differentiator */ + uint64_t le_hash; /* hash value of the name */ + } l_entry; + struct zap_leaf_array { + uint8_t la_type; /* always ZAP_CHUNK_ARRAY */ + union { + uint8_t la_array[ZAP_LEAF_ARRAY_BYTES]; + uint64_t la_array64; + } __attribute__ ((packed)); + uint16_t la_next; /* next blk or CHAIN_END */ + } l_array; + struct zap_leaf_free { + uint8_t lf_type; /* always ZAP_CHUNK_FREE */ + uint8_t lf_pad[ZAP_LEAF_ARRAY_BYTES]; + uint16_t lf_next; /* next in free list, or CHAIN_END */ + } l_free; +} zap_leaf_chunk_t; + +#endif /* _SYS_ZAP_LEAF_H */ diff --git a/include/zfs/zfs.h b/include/zfs/zfs.h new file mode 100644 index 0000000..b6d41c0 --- /dev/null +++ 
b/include/zfs/zfs.h @@ -0,0 +1,122 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004,2009 + * Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ + /* + * Copyright (c) 2007, 2010, Oracle and/or its affiliates. All rights reserved. + */ + +#ifndef GRUB_ZFS_HEADER +#define GRUB_ZFS_HEADER 1 + + +/* + * On-disk version number. + */ +#define SPA_VERSION 28ULL + +/* + * The following are configuration names used in the nvlist describing a pool's + * configuration. 
+ */ +#define ZPOOL_CONFIG_VERSION "version" +#define ZPOOL_CONFIG_POOL_NAME "name" +#define ZPOOL_CONFIG_POOL_STATE "state" +#define ZPOOL_CONFIG_POOL_TXG "txg" +#define ZPOOL_CONFIG_POOL_GUID "pool_guid" +#define ZPOOL_CONFIG_CREATE_TXG "create_txg" +#define ZPOOL_CONFIG_TOP_GUID "top_guid" +#define ZPOOL_CONFIG_VDEV_TREE "vdev_tree" +#define ZPOOL_CONFIG_TYPE "type" +#define ZPOOL_CONFIG_CHILDREN "children" +#define ZPOOL_CONFIG_ID "id" +#define ZPOOL_CONFIG_GUID "guid" +#define ZPOOL_CONFIG_PATH "path" +#define ZPOOL_CONFIG_DEVID "devid" +#define ZPOOL_CONFIG_METASLAB_ARRAY "metaslab_array" +#define ZPOOL_CONFIG_METASLAB_SHIFT "metaslab_shift" +#define ZPOOL_CONFIG_ASHIFT "ashift" +#define ZPOOL_CONFIG_ASIZE "asize" +#define ZPOOL_CONFIG_DTL "DTL" +#define ZPOOL_CONFIG_STATS "stats" +#define ZPOOL_CONFIG_WHOLE_DISK "whole_disk" +#define ZPOOL_CONFIG_ERRCOUNT "error_count" +#define ZPOOL_CONFIG_NOT_PRESENT "not_present" +#define ZPOOL_CONFIG_SPARES "spares" +#define ZPOOL_CONFIG_IS_SPARE "is_spare" +#define ZPOOL_CONFIG_NPARITY "nparity" +#define ZPOOL_CONFIG_PHYS_PATH "phys_path" +#define ZPOOL_CONFIG_L2CACHE "l2cache" +#define ZPOOL_CONFIG_HOLE_ARRAY "hole_array" +#define ZPOOL_CONFIG_VDEV_CHILDREN "vdev_children" +#define ZPOOL_CONFIG_IS_HOLE "is_hole" +#define ZPOOL_CONFIG_DDT_HISTOGRAM "ddt_histogram" +#define ZPOOL_CONFIG_DDT_OBJ_STATS "ddt_object_stats" +#define ZPOOL_CONFIG_DDT_STATS "ddt_stats" +/* + * The persistent vdev state is stored as separate values rather than a single + * 'vdev_state' entry. This is because a device can be in multiple states, such + * as offline and degraded. 
+ */ +#define ZPOOL_CONFIG_OFFLINE "offline" +#define ZPOOL_CONFIG_FAULTED "faulted" +#define ZPOOL_CONFIG_DEGRADED "degraded" +#define ZPOOL_CONFIG_REMOVED "removed" + +#define VDEV_TYPE_ROOT "root" +#define VDEV_TYPE_MIRROR "mirror" +#define VDEV_TYPE_REPLACING "replacing" +#define VDEV_TYPE_RAIDZ "raidz" +#define VDEV_TYPE_DISK "disk" +#define VDEV_TYPE_FILE "file" +#define VDEV_TYPE_MISSING "missing" +#define VDEV_TYPE_HOLE "hole" +#define VDEV_TYPE_SPARE "spare" +#define VDEV_TYPE_L2CACHE "l2cache" + +/* + * pool state. The following states are written to disk as part of the normal + * SPA lifecycle: ACTIVE, EXPORTED, DESTROYED, SPARE, L2CACHE. The remaining + * states are software abstractions used at various levels to communicate pool + * state. + */ +typedef enum pool_state { + POOL_STATE_ACTIVE = 0, /* In active use */ + POOL_STATE_EXPORTED, /* Explicitly exported */ + POOL_STATE_DESTROYED, /* Explicitly destroyed */ + POOL_STATE_SPARE, /* Reserved for hot spare use */ + POOL_STATE_L2CACHE, /* Level 2 ARC device */ + POOL_STATE_UNINITIALIZED, /* Internal spa_t state */ + POOL_STATE_UNAVAIL, /* Internal libzfs state */ + POOL_STATE_POTENTIALLY_ACTIVE /* Internal libzfs state */ +} pool_state_t; + +struct grub_zfs_data; + +int grub_zfs_fetch_nvlist(device_t dev, char **nvlist); +int grub_zfs_getmdnobj(device_t dev, const char *fsfilename, + uint64_t *mdnobj); + +char *grub_zfs_nvlist_lookup_string(char *nvlist, char *name); +char *grub_zfs_nvlist_lookup_nvlist(char *nvlist, char *name); +int grub_zfs_nvlist_lookup_uint64(char *nvlist, char *name, + uint64_t *out); +char *grub_zfs_nvlist_lookup_nvlist_array(char *nvlist, char *name, + size_t index); +int grub_zfs_nvlist_lookup_nvlist_array_get_nelm(char *nvlist, char *name); + +#endif /* ! 
GRUB_ZFS_HEADER */ diff --git a/include/zfs/zfs_acl.h b/include/zfs/zfs_acl.h new file mode 100644 index 0000000..66749af --- /dev/null +++ b/include/zfs/zfs_acl.h @@ -0,0 +1,55 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 + * Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2007 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. 
+ */ + +#ifndef _SYS_FS_ZFS_ACL_H +#define _SYS_FS_ZFS_ACL_H + +typedef struct zfs_oldace { + uint32_t z_fuid; /* "who" */ + uint32_t z_access_mask; /* access mask */ + uint16_t z_flags; /* flags, i.e inheritance */ + uint16_t z_type; /* type of entry allow/deny */ +} zfs_oldace_t; + +#define ACE_SLOT_CNT 6 + +typedef struct zfs_znode_acl_v0 { + uint64_t z_acl_extern_obj; /* ext acl pieces */ + uint32_t z_acl_count; /* Number of ACEs */ + uint16_t z_acl_version; /* acl version */ + uint16_t z_acl_pad; /* pad */ + zfs_oldace_t z_ace_data[ACE_SLOT_CNT]; /* 6 standard ACEs */ +} zfs_znode_acl_v0_t; + +#define ZFS_ACE_SPACE (sizeof(zfs_oldace_t) * ACE_SLOT_CNT) + +typedef struct zfs_znode_acl { + uint64_t z_acl_extern_obj; /* ext acl pieces */ + uint32_t z_acl_size; /* Number of bytes in ACL */ + uint16_t z_acl_version; /* acl version */ + uint16_t z_acl_count; /* ace count */ + uint8_t z_ace_data[ZFS_ACE_SPACE]; /* space for embedded ACEs */ +} zfs_znode_acl_t; + + +#endif /* _SYS_FS_ZFS_ACL_H */ diff --git a/include/zfs/zfs_znode.h b/include/zfs/zfs_znode.h new file mode 100644 index 0000000..e3265e3 --- /dev/null +++ b/include/zfs/zfs_znode.h @@ -0,0 +1,70 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2010 Sun Microsystems, Inc. 
All rights reserved. + * Use is subject to license terms. + */ + +#ifndef _SYS_FS_ZFS_ZNODE_H +#define _SYS_FS_ZFS_ZNODE_H + +#include <zfs/zfs_acl.h> + +#define MASTER_NODE_OBJ 1 +#define ZFS_ROOT_OBJ "ROOT" +#define ZPL_VERSION_STR "VERSION" +#define ZFS_SA_ATTRS "SA_ATTRS" + +#define ZPL_VERSION 5ULL + +#define ZFS_DIRENT_OBJ(de) BF64_GET(de, 0, 48) + +/* + * This is the persistent portion of the znode. It is stored + * in the "bonus buffer" of the file. Short symbolic links + * are also stored in the bonus buffer. + */ +typedef struct znode_phys { + uint64_t zp_atime[2]; /* 0 - last file access time */ + uint64_t zp_mtime[2]; /* 16 - last file modification time */ + uint64_t zp_ctime[2]; /* 32 - last file change time */ + uint64_t zp_crtime[2]; /* 48 - creation time */ + uint64_t zp_gen; /* 64 - generation (txg of creation) */ + uint64_t zp_mode; /* 72 - file mode bits */ + uint64_t zp_size; /* 80 - size of file */ + uint64_t zp_parent; /* 88 - directory parent (`..') */ + uint64_t zp_links; /* 96 - number of links to file */ + uint64_t zp_xattr; /* 104 - DMU object for xattrs */ + uint64_t zp_rdev; /* 112 - dev_t for VBLK & VCHR files */ + uint64_t zp_flags; /* 120 - persistent flags */ + uint64_t zp_uid; /* 128 - file owner */ + uint64_t zp_gid; /* 136 - owning group */ + uint64_t zp_pad[4]; /* 144 - future */ + zfs_znode_acl_t zp_acl; /* 176 - 263 ACL */ + /* + * Data may pad out any remaining bytes in the znode buffer, eg: + * + * |<---------------------- dnode_phys (512) ------------------------>| + * |<-- dnode (192) --->|<----------- "bonus" buffer (320) ---------->| + * |<---- znode (264) ---->|<---- data (56) ---->| + * + * At present, we only use this space to store symbolic links. 
+ */ +} znode_phys_t; + +#endif /* _SYS_FS_ZFS_ZNODE_H */ diff --git a/include/zfs/zil.h b/include/zfs/zil.h new file mode 100644 index 0000000..bc9d5e9 --- /dev/null +++ b/include/zfs/zil.h @@ -0,0 +1,56 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2009 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. + */ + +#ifndef _SYS_ZIL_H +#define _SYS_ZIL_H + +/* + * Intent log format: + * + * Each objset has its own intent log. The log header (zil_header_t) + * for objset N's intent log is kept in the Nth object of the SPA's + * intent_log objset. The log header points to a chain of log blocks, + * each of which contains log records (i.e., transactions) followed by + * a log block trailer (zil_trailer_t). The format of a log record + * depends on the record (or transaction) type, but all records begin + * with a common structure that defines the type, length, and txg. + */ + +/* + * Intent log header - this on disk structure holds fields to manage + * the log. All fields are 64 bit to easily handle cross architectures. 
+ */ +typedef struct zil_header { + uint64_t zh_claim_txg; /* txg in which log blocks were claimed */ + uint64_t zh_replay_seq; /* highest replayed sequence number */ + blkptr_t zh_log; /* log chain */ + uint64_t zh_claim_seq; /* highest claimed sequence number */ + uint64_t zh_flags; /* header flags */ + uint64_t zh_pad[4]; +} zil_header_t; + +/* + * zh_flags bit settings + */ +#define ZIL_REPLAY_NEEDED 0x1 /* replay needed - internal only */ + +#endif /* _SYS_ZIL_H */ diff --git a/include/zfs/zio.h b/include/zfs/zio.h new file mode 100644 index 0000000..38f90d5 --- /dev/null +++ b/include/zfs/zio.h @@ -0,0 +1,92 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2010 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. + */ + +#ifndef _ZIO_H +#define _ZIO_H + +#include <zfs/spa.h> + +#define ZEC_MAGIC 0x210da7ab10c7a11ULL /* zio data bloc tail */ + +typedef struct zio_eck { + uint64_t zec_magic; /* for validation, endianness */ + zio_cksum_t zec_cksum; /* 256-bit checksum */ +} zio_eck_t; + +/* + * Gang block headers are self-checksumming and contain an array + * of block pointers. 
+ */ +#define SPA_GANGBLOCKSIZE SPA_MINBLOCKSIZE +#define SPA_GBH_NBLKPTRS ((SPA_GANGBLOCKSIZE - \ + sizeof(zio_eck_t)) / sizeof(blkptr_t)) +#define SPA_GBH_FILLER ((SPA_GANGBLOCKSIZE - \ + sizeof(zio_eck_t) - \ + (SPA_GBH_NBLKPTRS * sizeof(blkptr_t))) /\ + sizeof(uint64_t)) + +#define ZIO_GET_IOSIZE(zio) \ + (BP_IS_GANG((zio)->io_bp) ? \ + SPA_GANGBLOCKSIZE : BP_GET_PSIZE((zio)->io_bp)) + +typedef struct zio_gbh { + blkptr_t zg_blkptr[SPA_GBH_NBLKPTRS]; + uint64_t zg_filler[SPA_GBH_FILLER]; + zio_eck_t zg_tail; +} zio_gbh_phys_t; + +enum zio_checksum { + ZIO_CHECKSUM_INHERIT = 0, + ZIO_CHECKSUM_ON, + ZIO_CHECKSUM_OFF, + ZIO_CHECKSUM_LABEL, + ZIO_CHECKSUM_GANG_HEADER, + ZIO_CHECKSUM_ZILOG, + ZIO_CHECKSUM_FLETCHER_2, + ZIO_CHECKSUM_FLETCHER_4, + ZIO_CHECKSUM_SHA256, + ZIO_CHECKSUM_ZILOG2, + ZIO_CHECKSUM_FUNCTIONS +}; + +#define ZIO_CHECKSUM_ON_VALUE ZIO_CHECKSUM_FLETCHER_2 +#define ZIO_CHECKSUM_DEFAULT ZIO_CHECKSUM_ON + +enum zio_compress { + ZIO_COMPRESS_INHERIT = 0, + ZIO_COMPRESS_ON, + ZIO_COMPRESS_OFF, + ZIO_COMPRESS_LZJB, + ZIO_COMPRESS_EMPTY, + ZIO_COMPRESS_GZIP1, + ZIO_COMPRESS_GZIP2, + ZIO_COMPRESS_GZIP3, + ZIO_COMPRESS_GZIP4, + ZIO_COMPRESS_GZIP5, + ZIO_COMPRESS_GZIP6, + ZIO_COMPRESS_GZIP7, + ZIO_COMPRESS_GZIP8, + ZIO_COMPRESS_GZIP9, + ZIO_COMPRESS_FUNCTIONS +}; + +#endif /* _ZIO_H */ diff --git a/include/zfs/zio_checksum.h b/include/zfs/zio_checksum.h new file mode 100644 index 0000000..8ade44a --- /dev/null +++ b/include/zfs/zio_checksum.h @@ -0,0 +1,49 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. 
+ * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2010 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. + */ + +#ifndef _SYS_ZIO_CHECKSUM_H +#define _SYS_ZIO_CHECKSUM_H + +/* + * Signature for checksum functions. + */ +typedef void zio_checksum_t(const void *data, uint64_t size, + grub_zfs_endian_t endian, zio_cksum_t *zcp); + +/* + * Information about each checksum function. + */ +typedef struct zio_checksum_info { + zio_checksum_t *ci_func; /* checksum function for each byteorder */ + int ci_correctable; /* number of correctable bits */ + int ci_eck; /* uses zio embedded checksum? */ + char *ci_name; /* descriptive name */ +} zio_checksum_info_t; + +extern void zio_checksum_SHA256(const void *, uint64_t, + grub_zfs_endian_t endian, zio_cksum_t *); +extern void fletcher_2(const void *, uint64_t, grub_zfs_endian_t endian, + zio_cksum_t *); +extern void fletcher_4(const void *, uint64_t, grub_zfs_endian_t endian, + zio_cksum_t *); + +#endif /* _SYS_ZIO_CHECKSUM_H */ diff --git a/include/zfs_common.h b/include/zfs_common.h new file mode 100644 index 0000000..969dbf5 --- /dev/null +++ b/include/zfs_common.h @@ -0,0 +1,94 @@ +/* + * ZFS filesystem implementation in Uboot by + * Jorgen Lundman <lundman at lundman.net> + * + * zfsfs support + * made from existing GRUB Sources by Sun, GNU and others. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. 
+ * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. + */ + +#ifndef __ZFS_COMMON__ +#define __ZFS_COMMON__ + +#define SECTOR_SIZE 0x200 +#define SECTOR_BITS 9 + +#define grub_le_to_cpu16 le16_to_cpu +#define grub_be_to_cpu16 be16_to_cpu +#define grub_le_to_cpu32 le32_to_cpu +#define grub_be_to_cpu32 be32_to_cpu +#define grub_le_to_cpu64 le64_to_cpu +#define grub_be_to_cpu64 be64_to_cpu + +#define grub_cpu_to_le64 cpu_to_le64 +#define grub_cpu_to_be64 cpu_to_be64 + +enum zfs_errors { + ZFS_ERR_NONE = 0, + ZFS_ERR_NOT_IMPLEMENTED_YET = -1, + ZFS_ERR_BAD_FS = -2, + ZFS_ERR_OUT_OF_MEMORY = -3, + ZFS_ERR_FILE_NOT_FOUND = -4, + ZFS_ERR_BAD_FILE_TYPE = -5, + ZFS_ERR_OUT_OF_RANGE = -6, +}; + +struct zfs_filesystem { + + /* Block Device Descriptor */ + block_dev_desc_t *dev_desc; +}; + + +extern block_dev_desc_t *zfs_dev_desc; + +struct device_s { + uint64_t part_length; +}; +typedef struct device_s *device_t; + +struct zfs_file { + device_t device; + uint64_t size; + void *data; + uint64_t offset; +}; + +typedef struct zfs_file *zfs_file_t; + +struct zfs_dirhook_info { + int dir; + int mtimeset; + time_t mtime; + time_t mtime2; +}; + + + + +struct zfs_filesystem *zfsget_fs(void); +int init_fs(block_dev_desc_t *dev_desc); +void deinit_fs(block_dev_desc_t *dev_desc); +int zfs_open(zfs_file_t, const char *filename); +uint64_t zfs_read(zfs_file_t, char *buf, uint64_t len); +struct grub_zfs_data *zfs_mount(device_t); +int zfs_close(zfs_file_t); +int zfs_ls(device_t dev, const char *path, + int (*hook) (const char *, const struct zfs_dirhook_info *)); +int zfs_devread(int sector, int 
byte_offset, int byte_len, char *buf); +int zfs_set_blk_dev(block_dev_desc_t *rbdd, int part); +void zfs_unmount(struct grub_zfs_data *data); +int lzjb_decompress(void *, void *, uint32_t, uint32_t); +#endif
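As a cross-check of the layout comments in the headers above (24-byte ZAP leaf chunks in zap_leaf.h; a 264-byte znode with zp_acl at offset 176, leaving 56 bytes of a 320-byte bonus buffer, in zfs_znode.h), the sizes can be asserted at compile time. The sketch below is not part of the patch. All demo_* names are hypothetical stand-alone mirrors of the posted structs, and it assumes a C11 compiler with the usual natural alignment for fixed-width integer types.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All demo_* names are hypothetical mirrors of structs in this patch. */

#define DEMO_ZAP_LEAF_CHUNKSIZE   24
#define DEMO_ZAP_LEAF_ARRAY_BYTES (DEMO_ZAP_LEAF_CHUNKSIZE - 3)

/* Mirrors struct zap_leaf_entry: 1+1+2+2+2+2+2+4+8 = 24 bytes. */
struct demo_zap_leaf_entry {
	uint8_t  le_type;
	uint8_t  le_int_size;
	uint16_t le_next;
	uint16_t le_name_chunk;
	uint16_t le_name_length;
	uint16_t le_value_chunk;
	uint16_t le_value_length;
	uint32_t le_cd;
	uint64_t le_hash;
};

/*
 * Mirrors struct zap_leaf_array. The packed union in the patch is
 * reduced here to its byte array, which has the same size (21 bytes).
 */
struct demo_zap_leaf_array {
	uint8_t  la_type;
	uint8_t  la_array[DEMO_ZAP_LEAF_ARRAY_BYTES];
	uint16_t la_next;
};

/* Mirrors zfs_oldace_t: 4+4+2+2 = 12 bytes. */
struct demo_oldace {
	uint32_t z_fuid;
	uint32_t z_access_mask;
	uint16_t z_flags;
	uint16_t z_type;
};

#define DEMO_ACE_SLOT_CNT 6
#define DEMO_ACE_SPACE (sizeof(struct demo_oldace) * DEMO_ACE_SLOT_CNT)

/* Mirrors zfs_znode_acl_t: 8+4+2+2 header plus 72 bytes of ACE space. */
struct demo_znode_acl {
	uint64_t z_acl_extern_obj;
	uint32_t z_acl_size;
	uint16_t z_acl_version;
	uint16_t z_acl_count;
	uint8_t  z_ace_data[DEMO_ACE_SPACE];
};

/* Mirrors znode_phys_t; the ten scalar fields are folded into one array. */
struct demo_znode_phys {
	uint64_t zp_atime[2];          /*   0 */
	uint64_t zp_mtime[2];          /*  16 */
	uint64_t zp_ctime[2];          /*  32 */
	uint64_t zp_crtime[2];         /*  48 */
	uint64_t zp_scalars[10];       /*  64: zp_gen .. zp_gid */
	uint64_t zp_pad[4];            /* 144 */
	struct demo_znode_acl zp_acl;  /* 176 */
};

static_assert(sizeof(struct demo_zap_leaf_entry) == DEMO_ZAP_LEAF_CHUNKSIZE,
	      "entry chunk must be 24 bytes");
static_assert(sizeof(struct demo_zap_leaf_array) == DEMO_ZAP_LEAF_CHUNKSIZE,
	      "array chunk must be 24 bytes");
static_assert(offsetof(struct demo_znode_phys, zp_acl) == 176,
	      "ACL starts at byte 176");
static_assert(sizeof(struct demo_znode_phys) == 264,
	      "znode is 264 bytes");

/* Bytes left in a bonus buffer after the znode (0 if it does not fit). */
size_t demo_bonus_spare_bytes(size_t bonus_len)
{
	if (bonus_len < sizeof(struct demo_znode_phys))
		return 0;
	return bonus_len - sizeof(struct demo_znode_phys);
}
```

If any of the static assertions failed on a given toolchain, the on-disk structures would need explicit packing there; checks of this kind are cheap to keep next to structs that mirror a disk format.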

Hi Jorgen,
On Thu, May 24, 2012 at 6:42 AM, Jorgen Lundman lundman@lundman.net wrote:
U-Boot port is based on sources forked from GRUB-0.97 by Sun in 2004, which can be found here: http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/grub/grub-0.97...
Released by Sun for GRUB under the license:  *  This program is free software; you can redistribute it and/or modify  *  it under the terms of the GNU General Public License as published by  *  the Free Software Foundation; either version 2 of the License, or  *  (at your option) any later version.
GRUB official releases include ZFS in version: ftp://alpha.gnu.org/gnu/grub/grub-1.99~rc1.tar.gz
And patched against GRUB Bazaar repository for ashift fixes (4KB HDDs) more conveniently found at github: https://github.com/pendor/grub-zfs/commit/e7b6ef3ac3b9685ac4c394c897b1d4221b...
Signed-off-by: Jorgen Lundman lundman@lundman.net
v3: * add missing patch revision history (this text) * Submitted as single patch per Wolfgang Denk instructions
v2: * Keep Makefile placement alphabetically sorted.   * Clean ugly line breaks and indentation errors   * Fix license corruption in fs/Makefile
Makefile           |   2 +-  common/Makefile        |   1 +  common/cmd_zfs.c       |  236 +++++  fs/Makefile          |   3 +-  fs/{ => zfs}/Makefile     |  39 +-  fs/zfs/dev.c         |  137 +++  fs/zfs/zfs.c         | 2396 ++++++++++++++++++++++++++++++++++++++++++  fs/zfs/zfs_fletcher.c     |  84 ++  fs/zfs/zfs_lzjb.c       |  94 ++  fs/zfs/zfs_sha256.c      |  145 +++  include/config_cmd_all.h   |   1 +  include/zfs/dmu.h       |  119 +++  include/zfs/dmu_objset.h   |  43 +  include/zfs/dnode.h      |  80 ++  include/zfs/dsl_dataset.h   |  52 +  include/zfs/dsl_dir.h     |  48 +  include/zfs/sa_impl.h     |  34 +  include/zfs/spa.h       |  311 ++++++  include/zfs/uberblock_impl.h |  57 +  include/zfs/vdev_impl.h    |  69 ++  include/zfs/zap_impl.h    |  112 ++  include/zfs/zap_leaf.h    |  103 ++  include/zfs/zfs.h       |  122 +++  include/zfs/zfs_acl.h     |  55 +  include/zfs/zfs_znode.h    |  70 ++  include/zfs/zil.h       |  56 +  include/zfs/zio.h       |  92 ++  include/zfs/zio_checksum.h  |  49 +  include/zfs_common.h     |  94 ++  29 files changed, 4687 insertions(+), 17 deletions(-)  create mode 100644 common/cmd_zfs.c  copy fs/{ => zfs}/Makefile (56%)  create mode 100644 fs/zfs/dev.c  create mode 100644 fs/zfs/zfs.c  create mode 100644 fs/zfs/zfs_fletcher.c  create mode 100644 fs/zfs/zfs_lzjb.c  create mode 100644 fs/zfs/zfs_sha256.c  create mode 100644 include/zfs/dmu.h  create mode 100644 include/zfs/dmu_objset.h  create mode 100644 include/zfs/dnode.h  create mode 100644 include/zfs/dsl_dataset.h  create mode 100644 include/zfs/dsl_dir.h  create mode 100644 include/zfs/sa_impl.h  create mode 100644 include/zfs/spa.h  create mode 100644 include/zfs/uberblock_impl.h  create mode 100644 include/zfs/vdev_impl.h  create mode 100644 include/zfs/zap_impl.h  create mode 100644 include/zfs/zap_leaf.h  create mode 100644 include/zfs/zfs.h  create mode 100644 include/zfs/zfs_acl.h  create mode 100644 include/zfs/zfs_znode.h  create mode 100644 
include/zfs/zil.h  create mode 100644 include/zfs/zio.h  create mode 100644 include/zfs/zio_checksum.h  create mode 100644 include/zfs_common.h
.. [snip] ..
A README entry in the doc folder would be very helpful for users of the ZFS commands.
Thx, --Prabhakar Lad http://in.linkedin.com/pub/prabhakar-lad/19/92b/955
+int lzjb_decompress(void *, void *, uint32_t, uint32_t);
+#endif
1.7.0.4
U-Boot mailing list U-Boot@lists.denx.de http://lists.denx.de/mailman/listinfo/u-boot

Patch to add ZFS filesystem support to u-boot, based on GRUB sources. Thank you for your patience.
Jorgen Lundman (1): zfs: Add ZFS filesystem support
Makefile | 2 +- common/Makefile | 1 + common/cmd_zfs.c | 236 +++++ doc/README.zfs | 30 + fs/Makefile | 3 +- fs/{ => zfs}/Makefile | 39 +- fs/zfs/dev.c | 137 +++ fs/zfs/zfs.c | 2396 ++++++++++++++++++++++++++++++++++++++++++ fs/zfs/zfs_fletcher.c | 84 ++ fs/zfs/zfs_lzjb.c | 94 ++ fs/zfs/zfs_sha256.c | 145 +++ include/config_cmd_all.h | 1 + include/zfs/dmu.h | 119 +++ include/zfs/dmu_objset.h | 43 + include/zfs/dnode.h | 80 ++ include/zfs/dsl_dataset.h | 52 + include/zfs/dsl_dir.h | 48 + include/zfs/sa_impl.h | 34 + include/zfs/spa.h | 311 ++++++ include/zfs/uberblock_impl.h | 57 + include/zfs/vdev_impl.h | 69 ++ include/zfs/zap_impl.h | 112 ++ include/zfs/zap_leaf.h | 103 ++ include/zfs/zfs.h | 122 +++ include/zfs/zfs_acl.h | 55 + include/zfs/zfs_znode.h | 70 ++ include/zfs/zil.h | 56 + include/zfs/zio.h | 92 ++ include/zfs/zio_checksum.h | 49 + include/zfs_common.h | 94 ++ 30 files changed, 4717 insertions(+), 17 deletions(-) create mode 100644 common/cmd_zfs.c create mode 100644 doc/README.zfs copy fs/{ => zfs}/Makefile (56%) create mode 100644 fs/zfs/dev.c create mode 100644 fs/zfs/zfs.c create mode 100644 fs/zfs/zfs_fletcher.c create mode 100644 fs/zfs/zfs_lzjb.c create mode 100644 fs/zfs/zfs_sha256.c create mode 100644 include/zfs/dmu.h create mode 100644 include/zfs/dmu_objset.h create mode 100644 include/zfs/dnode.h create mode 100644 include/zfs/dsl_dataset.h create mode 100644 include/zfs/dsl_dir.h create mode 100644 include/zfs/sa_impl.h create mode 100644 include/zfs/spa.h create mode 100644 include/zfs/uberblock_impl.h create mode 100644 include/zfs/vdev_impl.h create mode 100644 include/zfs/zap_impl.h create mode 100644 include/zfs/zap_leaf.h create mode 100644 include/zfs/zfs.h create mode 100644 include/zfs/zfs_acl.h create mode 100644 include/zfs/zfs_znode.h create mode 100644 include/zfs/zil.h create mode 100644 include/zfs/zio.h create mode 100644 include/zfs/zio_checksum.h create mode 100644 include/zfs_common.h

U-Boot port is based on sources forked from GRUB-0.97 by Sun in 2004, which can be found here: http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/grub/grub-0.97...
Released by Sun for GRUB under the license: * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 2 of the License, or * (at your option) any later version.
GRUB official releases include ZFS in version: ftp://alpha.gnu.org/gnu/grub/grub-1.99~rc1.tar.gz
And patched against GRUB Bazaar repository for ashift fixes (4KB HDDs) more conveniently found at github: https://github.com/pendor/grub-zfs/commit/e7b6ef3ac3b9685ac4c394c897b1d4221b...
Signed-off-by: Jorgen Lundman lundman@lundman.net
---
v4: * Add doc/README.zfs documentation
v3: * add missing patch revision history (this text) * Submitted as single patch per Wolfgang Denk instructions
v2: * Keep Makefile placement alphabetically sorted. * Clean ugly line breaks and indentation errors * Fix license corruption in fs/Makefile --- Makefile | 2 +- common/Makefile | 1 + common/cmd_zfs.c | 236 +++++ doc/README.zfs | 30 + fs/Makefile | 3 +- fs/{ => zfs}/Makefile | 39 +- fs/zfs/dev.c | 137 +++ fs/zfs/zfs.c | 2396 ++++++++++++++++++++++++++++++++++++++++++ fs/zfs/zfs_fletcher.c | 84 ++ fs/zfs/zfs_lzjb.c | 94 ++ fs/zfs/zfs_sha256.c | 145 +++ include/config_cmd_all.h | 1 + include/zfs/dmu.h | 119 +++ include/zfs/dmu_objset.h | 43 + include/zfs/dnode.h | 80 ++ include/zfs/dsl_dataset.h | 52 + include/zfs/dsl_dir.h | 48 + include/zfs/sa_impl.h | 34 + include/zfs/spa.h | 311 ++++++ include/zfs/uberblock_impl.h | 57 + include/zfs/vdev_impl.h | 69 ++ include/zfs/zap_impl.h | 112 ++ include/zfs/zap_leaf.h | 103 ++ include/zfs/zfs.h | 122 +++ include/zfs/zfs_acl.h | 55 + include/zfs/zfs_znode.h | 70 ++ include/zfs/zil.h | 56 + include/zfs/zio.h | 92 ++ include/zfs/zio_checksum.h | 49 + include/zfs_common.h | 94 ++ 30 files changed, 4717 insertions(+), 17 deletions(-) create mode 100644 common/cmd_zfs.c create mode 100644 doc/README.zfs copy fs/{ => zfs}/Makefile (56%) create mode 100644 fs/zfs/dev.c create mode 100644 fs/zfs/zfs.c create mode 100644 fs/zfs/zfs_fletcher.c create mode 100644 fs/zfs/zfs_lzjb.c create mode 100644 fs/zfs/zfs_sha256.c create mode 100644 include/zfs/dmu.h create mode 100644 include/zfs/dmu_objset.h create mode 100644 include/zfs/dnode.h create mode 100644 include/zfs/dsl_dataset.h create mode 100644 include/zfs/dsl_dir.h create mode 100644 include/zfs/sa_impl.h create mode 100644 include/zfs/spa.h create mode 100644 include/zfs/uberblock_impl.h create mode 100644 include/zfs/vdev_impl.h create mode 100644 include/zfs/zap_impl.h create mode 100644 include/zfs/zap_leaf.h create mode 100644 include/zfs/zfs.h create mode 100644 include/zfs/zfs_acl.h create mode 100644 include/zfs/zfs_znode.h create mode 100644 include/zfs/zil.h create mode 
100644 include/zfs/zio.h create mode 100644 include/zfs/zio_checksum.h create mode 100644 include/zfs_common.h
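The command help text and doc/README.zfs in this patch name files as '/POOL/@/dir/file' (for example /rpool/@/boot/uImage). As a rough illustration of that notation only, the sketch below splits such a string at the "/@/" marker into a pool/dataset name and an in-filesystem path. demo_split_zfs_path is a hypothetical helper, not a function from fs/zfs/zfs.c, and the patch's own parser may well treat the '@' component differently (for instance as a snapshot name).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): split "/POOL/@/dir/file"
 * into the dataset name ("POOL") and the path inside it ("/dir/file").
 *
 * dataset: caller-provided buffer of dlen bytes for the dataset name
 * file:    set to point into 'full' at the start of the inner path
 *
 * Returns 0 on success, -1 if the string does not follow the notation
 * or the dataset name does not fit in the buffer.
 */
int demo_split_zfs_path(const char *full, char *dataset, size_t dlen,
			const char **file)
{
	const char *mark;
	size_t off, n;

	if (full == NULL || dataset == NULL || file == NULL)
		return -1;
	if (full[0] != '/')
		return -1;

	mark = strstr(full, "/@/");
	if (mark == NULL)
		return -1;

	/* Need at least one dataset character between the leading '/' and "/@/". */
	off = (size_t)(mark - full);
	if (off < 2)
		return -1;

	n = off - 1;            /* dataset length without the leading '/' */
	if (n >= dlen)
		return -1;

	memcpy(dataset, full + 1, n);
	dataset[n] = '\0';
	*file = mark + 2;       /* skip "/@", keep the '/' that starts the path */
	return 0;
}
```

With the README's example, "/rpool/@/boot/uImage" would yield the dataset "rpool" and the inner path "/boot/uImage"; nested dataset names such as "/rpool/ROOT/@/boot" pass through unchanged as "rpool/ROOT".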
diff --git a/Makefile b/Makefile index 351a8f0..d3b84bf 100644 --- a/Makefile +++ b/Makefile @@ -244,7 +244,7 @@ endif LIBS += arch/$(ARCH)/lib/lib$(ARCH).o LIBS += fs/cramfs/libcramfs.o fs/fat/libfat.o fs/fdos/libfdos.o fs/jffs2/libjffs2.o \ fs/reiserfs/libreiserfs.o fs/ext2/libext2fs.o fs/yaffs2/libyaffs2.o \ - fs/ubifs/libubifs.o + fs/ubifs/libubifs.o fs/zfs/libzfs.o LIBS += net/libnet.o LIBS += disk/libdisk.o LIBS += drivers/bios_emulator/libatibiosemu.o diff --git a/common/Makefile b/common/Makefile index 6e23baa..4de03da 100644 --- a/common/Makefile +++ b/common/Makefile @@ -164,6 +164,7 @@ COBJS-$(CONFIG_USB_STORAGE) += usb_storage.o endif COBJS-$(CONFIG_CMD_XIMG) += cmd_ximg.o COBJS-$(CONFIG_YAFFS2) += cmd_yaffs2.o +COBJS-$(CONFIG_CMD_ZFS) += cmd_zfs.o COBJS-$(CONFIG_CMD_SPL) += cmd_spl.o
# others diff --git a/common/cmd_zfs.c b/common/cmd_zfs.c new file mode 100644 index 0000000..a6ea2c0 --- /dev/null +++ b/common/cmd_zfs.c @@ -0,0 +1,236 @@ +/* + * + * ZFS filesystem porting to Uboot by + * Jorgen Lundman <lundman at lundman.net> + * + * zfsfs support + * made from existing GRUB Sources by Sun, GNU and others. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License as + * published by the Free Software Foundation; either version 2 of + * the License, or (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, + * MA 02111-1307 USA + * + */ + +#include <common.h> +#include <part.h> +#include <config.h> +#include <command.h> +#include <image.h> +#include <linux/ctype.h> +#include <asm/byteorder.h> +#include <zfs_common.h> +#include <linux/stat.h> +#include <malloc.h> + +#if defined(CONFIG_CMD_USB) && defined(CONFIG_USB_STORAGE) +#include <usb.h> +#endif + +#if !defined(CONFIG_DOS_PARTITION) && !defined(CONFIG_EFI_PARTITION) +#error DOS or EFI partition support must be selected +#endif + +#define DOS_PART_MAGIC_OFFSET 0x1fe +#define DOS_FS_TYPE_OFFSET 0x36 +#define DOS_FS32_TYPE_OFFSET 0x52 + +static int do_zfs_load(cmd_tbl_t *cmdtp, int flag, int argc, char *argv[]) +{ + char *filename = NULL; + char *ep; + int dev; + unsigned long part = 1; + ulong addr = 0; + ulong part_length; + disk_partition_t info; + char buf[12]; + unsigned long count; + const char *addr_str; + struct zfs_file zfile; + struct device_s vdev; + + if (argc < 3) + return 
CMD_RET_USAGE; + + count = 0; + addr = (argc > 3) ? simple_strtoul(argv[3], NULL, 16) : 0; + filename = getenv("bootfile"); + switch (argc) { + case 3: + addr_str = getenv("loadaddr"); + if (addr_str != NULL) + addr = simple_strtoul(addr_str, NULL, 16); + else + addr = CONFIG_SYS_LOAD_ADDR; + + break; + case 4: + break; + case 5: + filename = argv[4]; + break; + case 6: + filename = argv[4]; + count = simple_strtoul(argv[5], NULL, 16); + break; + + default: + return cmd_usage(cmdtp); + } + + if (!filename) { + puts("** No boot file defined **\n"); + return 1; + } + + dev = (int)simple_strtoul(argv[2], &ep, 16); + zfs_dev_desc = get_dev(argv[1], dev); + if (zfs_dev_desc == NULL) { + printf("** Block device %s %d not supported\n", argv[1], dev); + return 1; + } + + if (*ep) { + if (*ep != ':') { + puts("** Invalid boot device, use `dev[:part]' **\n"); + return 1; + } + part = simple_strtoul(++ep, NULL, 16); + } + + if (part != 0) { + if (get_partition_info(zfs_dev_desc, part, &info)) { + printf("** Bad partition %lu **\n", part); + return 1; + } + + if (strncmp((char *)info.type, BOOT_PART_TYPE, + strlen(BOOT_PART_TYPE)) != 0) { + printf("** Invalid partition type \"%s\" (expect \"" BOOT_PART_TYPE "\")\n", + info.type); + return 1; + } + printf("Loading file \"%s\" " + "from %s device %d:%lu %s\n", + filename, argv[1], dev, part, info.name); + } else { + printf("Loading file \"%s\" from %s device %d\n", + filename, argv[1], dev); + } + + part_length = zfs_set_blk_dev(zfs_dev_desc, part); + if (part_length == 0) { + printf("** Bad partition - %s %d:%lu **\n", argv[1], dev, part); + return 1; + } + + vdev.part_length = part_length; + + memset(&zfile, 0, sizeof(zfile)); + zfile.device = &vdev; + if (zfs_open(&zfile, filename)) { + printf("** File not found %s\n", filename); + return 1; + } + + if ((count < zfile.size) && (count != 0)) + zfile.size = (uint64_t)count; + + if (zfs_read(&zfile, (char *)addr, zfile.size) != zfile.size) { + printf("** Unable to read \"%s\" from %s %d:%lu **\n", +
filename, argv[1], dev, part); + zfs_close(&zfile); + return 1; + } + + zfs_close(&zfile); + + /* Loading ok, update default load address */ + load_addr = addr; + + printf("%llu bytes read\n", zfile.size); + sprintf(buf, "%llX", zfile.size); + setenv("filesize", buf); + + return 0; +} + + +int zfs_print(const char *entry, const struct zfs_dirhook_info *data) +{ + printf("%s %s\n", + data->dir ? "<DIR> " : " ", + entry); + return 0; /* 0 continue, 1 stop */ +} + + + +static int do_zfs_ls(cmd_tbl_t *cmdtp, int flag, int argc, char *argv[]) +{ + const char *filename = "/"; + int dev; + unsigned long part = 1; + char *ep; + int part_length; + struct device_s vdev; + + if (argc < 3) + return cmd_usage(cmdtp); + + dev = (int)simple_strtoul(argv[2], &ep, 16); + zfs_dev_desc = get_dev(argv[1], dev); + + if (zfs_dev_desc == NULL) { + printf("\n** Block device %s %d not supported\n", argv[1], dev); + return 1; + } + + if (*ep) { + if (*ep != ':') { + puts("\n** Invalid boot device, use `dev[:part]' **\n"); + return 1; + } + part = simple_strtoul(++ep, NULL, 16); + } + + if (argc == 4) + filename = argv[3]; + + part_length = zfs_set_blk_dev(zfs_dev_desc, part); + if (part_length == 0) { + printf("** Bad partition - %s %d:%lu **\n", argv[1], dev, part); + return 1; + } + + vdev.part_length = part_length; + + zfs_ls(&vdev, filename, + zfs_print); + + return 0; +} + + +U_BOOT_CMD(zfsls, 4, 1, do_zfs_ls, + "list files in a directory (default /)", + "<interface> <dev[:part]> [directory]\n" + " - list files from 'dev' on 'interface' in a '/DATASET/@/$dir/'"); + +U_BOOT_CMD(zfsload, 6, 0, do_zfs_load, + "load binary file from a ZFS filesystem", + "<interface> <dev[:part]> [addr] [filename] [bytes]\n" + " - load binary file '/DATASET/@/$dir/$file' from 'dev' on 'interface'\n" + " to address 'addr' from ZFS filesystem"); diff --git a/doc/README.zfs b/doc/README.zfs new file mode 100644 index 0000000..4b0e8a5 --- /dev/null +++ b/doc/README.zfs @@ -0,0 +1,30 @@ +This patch series adds 
support for listing and loading files from ZFS filesystems to U-Boot. + +To enable the zfsls and zfsload commands, add the following to the board-specific config file: +#define CONFIG_CMD_ZFS + +Steps to test: + +1. After applying the patch, the ZFS-specific commands can be seen + at the boot loader prompt using + UBOOT #help + + zfsload - load binary file from a ZFS filesystem + zfsls - list files in a directory (default /) + +2. To list the files in a ZFS pool, device or partition, execute + zfsls <interface> <dev[:part]> [POOL/@/dir/file] + For example: + UBOOT #zfsls mmc 0:5 /rpool/@/usr/bin/ + +3. To read a file from a ZFS-formatted partition and load it into RAM, execute + zfsload <interface> <dev[:part]> [addr] [filename] [bytes] + For example: + UBOOT #zfsload mmc 2:2 0x30007fc0 /rpool/@/boot/uImage + +References: + -- ZFS GRUB sources from Solaris GRUB-0.97 + -- GRUB Bazaar repository + +Jorgen Lundman <lundman at lundman.net> 2012. + diff --git a/fs/Makefile b/fs/Makefile index 22aad12..baa7e96 100644 --- a/fs/Makefile +++ b/fs/Makefile @@ -1,6 +1,6 @@ # # (C) Copyright 2000-2006 -# Wolfgang Denk, DENX Software Engineering, wd@denx.de. +# Wolfgang Denk, DENX Software Engineering, <wd at denx.de> # # See file CREDITS for list of people who contributed to this # project. @@ -30,6 +30,7 @@ subdirs-$(CONFIG_CMD_JFFS2) += jffs2 subdirs-$(CONFIG_CMD_REISER) += reiserfs subdirs-$(CONFIG_YAFFS2) += yaffs2 subdirs-$(CONFIG_CMD_UBIFS) += ubifs +subdirs-$(CONFIG_CMD_ZFS) += zfs
SUBDIRS := $(subdirs-y)
diff --git a/fs/Makefile b/fs/zfs/Makefile similarity index 56% copy from fs/Makefile copy to fs/zfs/Makefile index 22aad12..938fc5e 100644 --- a/fs/Makefile +++ b/fs/zfs/Makefile @@ -1,6 +1,6 @@ # -# (C) Copyright 2000-2006 -# Wolfgang Denk, DENX Software Engineering, wd@denx.de. +# (C) Copyright 2012 +# Jorgen Lundman <lundman at lundman.net> # # See file CREDITS for list of people who contributed to this # project. @@ -20,19 +20,28 @@ # Foundation, Inc., 59 Temple Place, Suite 330, Boston, # MA 02111-1307 USA # -#
-subdirs-$(CONFIG_CMD_CRAMFS) := cramfs -subdirs-$(CONFIG_CMD_EXT2) += ext2 -subdirs-$(CONFIG_CMD_FAT) += fat -subdirs-$(CONFIG_CMD_FDOS) += fdos -subdirs-$(CONFIG_CMD_JFFS2) += jffs2 -subdirs-$(CONFIG_CMD_REISER) += reiserfs -subdirs-$(CONFIG_YAFFS2) += yaffs2 -subdirs-$(CONFIG_CMD_UBIFS) += ubifs +include $(TOPDIR)/config.mk + +LIB = $(obj)libzfs.o + +AOBJS = +COBJS-$(CONFIG_CMD_ZFS) := dev.o zfs.o zfs_fletcher.o zfs_sha256.o zfs_lzjb.o + +SRCS := $(AOBJS:.o=.S) $(COBJS-y:.o=.c) +OBJS := $(addprefix $(obj),$(AOBJS) $(COBJS-y)) + + +all: $(LIB) $(AOBJS) + +$(LIB): $(obj).depend $(OBJS) + $(call cmd_link_o_target, $(OBJS)) + +######################################################################### + +# defines $(obj).depend target +include $(SRCTREE)/rules.mk
-SUBDIRS := $(subdirs-y) +sinclude $(obj).depend
-$(obj).depend all: - @for dir in $(SUBDIRS) ; do \ - $(MAKE) -C $$dir $@ ; done +######################################################################### diff --git a/fs/zfs/dev.c b/fs/zfs/dev.c new file mode 100644 index 0000000..ab32865 --- /dev/null +++ b/fs/zfs/dev.c @@ -0,0 +1,137 @@ +/* + * + * based on code of fs/reiserfs/dev.c by + * + * (C) Copyright 2003 - 2004 + * Sysgo AG, <www.elinos.com>, Pavel Bartusek pba@sysgo.com + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. 
+ */ + + +#include <common.h> +#include <config.h> +#include <zfs_common.h> + +static block_dev_desc_t *zfs_block_dev_desc; +static disk_partition_t part_info; + +int zfs_set_blk_dev(block_dev_desc_t *rbdd, int part) +{ + zfs_block_dev_desc = rbdd; + + if (part == 0) { + /* disk doesn't use partition table */ + part_info.start = 0; + part_info.size = rbdd->lba; + part_info.blksz = rbdd->blksz; + } else { + if (get_partition_info(zfs_block_dev_desc, part, &part_info)) + return 0; + } + + return part_info.size; +} + +/* Returns 0 on success, 1 on error. */ +int zfs_devread(int sector, int byte_offset, int byte_len, char *buf) +{ + short sec_buffer[SECTOR_SIZE/sizeof(short)]; + char *sec_buf = (char *)sec_buffer; + unsigned block_len; + + /* + * Check partition boundaries + */ + if ((sector < 0) || + ((sector + ((byte_offset + byte_len - 1) >> SECTOR_BITS)) >= + part_info.size)) { + /* errnum = ERR_OUTSIDE_PART; */ + printf(" ** zfs_devread() read outside partition sector %d\n", sector); + return 1; + } + + /* + * Get the read to the beginning of a partition.
+ */ + sector += byte_offset >> SECTOR_BITS; + byte_offset &= SECTOR_SIZE - 1; + + debug(" <%d, %d, %d>\n", sector, byte_offset, byte_len); + + if (zfs_block_dev_desc == NULL) { + printf("** Invalid Block Device Descriptor (NULL)\n"); + return 1; + } + + if (byte_offset != 0) { + /* read first part which isn't aligned with start of sector */ + if (zfs_block_dev_desc->block_read(zfs_block_dev_desc->dev, + part_info.start + sector, 1, + (unsigned long *) sec_buf) != 1) { + printf(" ** zfs_devread() read error **\n"); + return 1; + } + memcpy(buf, sec_buf + byte_offset, + min(SECTOR_SIZE - byte_offset, byte_len)); + buf += min(SECTOR_SIZE - byte_offset, byte_len); + byte_len -= min(SECTOR_SIZE - byte_offset, byte_len); + sector++; + } + + if (byte_len == 0) + return 0; + + /* read sector aligned part */ + block_len = byte_len & ~(SECTOR_SIZE - 1); + + if (block_len == 0) { + u8 p[SECTOR_SIZE]; + + block_len = SECTOR_SIZE; + zfs_block_dev_desc->block_read(zfs_block_dev_desc->dev, + part_info.start + sector, + 1, (unsigned long *)p); + memcpy(buf, p, byte_len); + return 0; + } + + if (zfs_block_dev_desc->block_read(zfs_block_dev_desc->dev, + part_info.start + sector, + block_len / SECTOR_SIZE, + (unsigned long *) buf) != + block_len / SECTOR_SIZE) { + printf(" ** zfs_devread() read error - block\n"); + return 1; + } + + block_len = byte_len & ~(SECTOR_SIZE - 1); + buf += block_len; + byte_len -= block_len; + sector += block_len / SECTOR_SIZE; + + if (byte_len != 0) { + /* read rest of data which are not in whole sector */ + if (zfs_block_dev_desc-> + block_read(zfs_block_dev_desc->dev, + part_info.start + sector, 1, + (unsigned long *) sec_buf) != 1) { + printf(" ** zfs_devread() read error - last part\n"); + return 1; + } + memcpy(buf, sec_buf, byte_len); + } + return 0; +} diff --git a/fs/zfs/zfs.c b/fs/zfs/zfs.c new file mode 100644 index 0000000..d6e0e23 --- /dev/null +++ b/fs/zfs/zfs.c @@ -0,0 +1,2396 @@ +/* + * + * ZFS filesystem ported to u-boot by + * Jorgen 
Lundman <lundman at lundman.net> + * + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 + * Free Software Foundation, Inc. + * Copyright 2004 Sun Microsystems, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + * + */ + +#include <common.h> +#include <malloc.h> +#include <linux/stat.h> +#include <linux/time.h> +#include <linux/ctype.h> +#include <asm/byteorder.h> +#include "zfs_common.h" + +block_dev_desc_t *zfs_dev_desc; + +/* + * The zfs plug-in routines for GRUB are: + * + * zfs_mount() - locates a valid uberblock of the root pool and reads + * in its MOS at the memory address MOS. + * + * zfs_open() - locates a plain file object by following the MOS + * and places its dnode at the memory address DNODE. + * + * zfs_read() - read in the data blocks pointed by the DNODE. + * + */ + +#include <zfs/zfs.h> +#include <zfs/zio.h> +#include <zfs/dnode.h> +#include <zfs/uberblock_impl.h> +#include <zfs/vdev_impl.h> +#include <zfs/zio_checksum.h> +#include <zfs/zap_impl.h> +#include <zfs/zap_leaf.h> +#include <zfs/zfs_znode.h> +#include <zfs/dmu.h> +#include <zfs/dmu_objset.h> +#include <zfs/sa_impl.h> +#include <zfs/dsl_dir.h> +#include <zfs/dsl_dataset.h> + + +#define ZPOOL_PROP_BOOTFS "bootfs" + + +/* + * For nvlist manipulation. 
(from nvpair.h) + */ +#define NV_ENCODE_NATIVE 0 +#define NV_ENCODE_XDR 1 +#define NV_BIG_ENDIAN 0 +#define NV_LITTLE_ENDIAN 1 +#define DATA_TYPE_UINT64 8 +#define DATA_TYPE_STRING 9 +#define DATA_TYPE_NVLIST 19 +#define DATA_TYPE_NVLIST_ARRAY 20 + + +/* + * Macros to get fields in a bp or DVA. + */ +#define P2PHASE(x, align) ((x) & ((align) - 1)) +#define DVA_OFFSET_TO_PHYS_SECTOR(offset) \ + ((offset + VDEV_LABEL_START_SIZE) >> SPA_MINBLOCKSHIFT) + +/* + * return x rounded down to an align boundary + * eg, P2ALIGN(1200, 1024) == 1024 (1*align) + * eg, P2ALIGN(1024, 1024) == 1024 (1*align) + * eg, P2ALIGN(0x1234, 0x100) == 0x1200 (0x12*align) + * eg, P2ALIGN(0x5600, 0x100) == 0x5600 (0x56*align) + */ +#define P2ALIGN(x, align) ((x) & -(align)) + +/* + * FAT ZAP data structures + */ +#define ZFS_CRC64_POLY 0xC96C5795D7870F42ULL /* ECMA-182, reflected form */ +#define ZAP_HASH_IDX(hash, n) (((n) == 0) ? 0 : ((hash) >> (64 - (n)))) +#define CHAIN_END 0xffff /* end of the chunk chain */ + +/* + * The amount of space within the chunk available for the array is: + * chunk size - space for type (1) - space for next pointer (2) + */ +#define ZAP_LEAF_ARRAY_BYTES (ZAP_LEAF_CHUNKSIZE - 3) + +#define ZAP_LEAF_HASH_SHIFT(bs) (bs - 5) +#define ZAP_LEAF_HASH_NUMENTRIES(bs) (1 << ZAP_LEAF_HASH_SHIFT(bs)) +#define LEAF_HASH(bs, h) \ + ((ZAP_LEAF_HASH_NUMENTRIES(bs)-1) & \ + ((h) >> (64 - ZAP_LEAF_HASH_SHIFT(bs)-l->l_hdr.lh_prefix_len))) + +/* + * The amount of space available for chunks is: + * block size shift - hash entry size (2) * number of hash + * entries - header space (2*chunksize) + */ +#define ZAP_LEAF_NUMCHUNKS(bs) \ + (((1<<bs) - 2*ZAP_LEAF_HASH_NUMENTRIES(bs)) / \ + ZAP_LEAF_CHUNKSIZE - 2) + +/* + * The chunks start immediately after the hash table. The end of the + * hash table is at l_hash + HASH_NUMENTRIES, which we simply cast to a + * chunk_t. 
+ */ +#define ZAP_LEAF_CHUNK(l, bs, idx) \ + ((zap_leaf_chunk_t *)(l->l_hash + ZAP_LEAF_HASH_NUMENTRIES(bs)))[idx] +#define ZAP_LEAF_ENTRY(l, bs, idx) (&ZAP_LEAF_CHUNK(l, bs, idx).l_entry) + + +/* + * Decompression Entry - lzjb + */ +#ifndef NBBY +#define NBBY 8 +#endif + + + +typedef int zfs_decomp_func_t(void *s_start, void *d_start, + uint32_t s_len, uint32_t d_len); +typedef struct decomp_entry { + char *name; + zfs_decomp_func_t *decomp_func; +} decomp_entry_t; + +typedef struct dnode_end { + dnode_phys_t dn; + grub_zfs_endian_t endian; +} dnode_end_t; + +struct grub_zfs_data { + /* cache for a file block of the currently zfs_open()-ed file */ + char *file_buf; + uint64_t file_start; + uint64_t file_end; + + /* XXX: ashift is per vdev, not per pool. We currently only ever touch + * a single vdev, but when/if raid-z or stripes are supported, this + * may need revision. + */ + uint64_t vdev_ashift; + uint64_t label_txg; + uint64_t pool_guid; + + /* cache for a dnode block */ + dnode_phys_t *dnode_buf; + dnode_phys_t *dnode_mdn; + uint64_t dnode_start; + uint64_t dnode_end; + grub_zfs_endian_t dnode_endian; + + uberblock_t current_uberblock; + + dnode_end_t mos; + dnode_end_t mdn; + dnode_end_t dnode; + + uint64_t vdev_phys_sector; + + int (*userhook)(const char *, const struct zfs_dirhook_info *); + struct zfs_dirhook_info *dirinfo; + +}; + + + + +static int +zlib_decompress(void *s, void *d, + uint32_t slen, uint32_t dlen) +{ + if (zlib_decompress(s, d, slen, dlen) < 0) + return ZFS_ERR_BAD_FS; + return ZFS_ERR_NONE; +} + +static decomp_entry_t decomp_table[ZIO_COMPRESS_FUNCTIONS] = { + {"inherit", NULL}, /* ZIO_COMPRESS_INHERIT */ + {"on", lzjb_decompress}, /* ZIO_COMPRESS_ON */ + {"off", NULL}, /* ZIO_COMPRESS_OFF */ + {"lzjb", lzjb_decompress}, /* ZIO_COMPRESS_LZJB */ + {"empty", NULL}, /* ZIO_COMPRESS_EMPTY */ + {"gzip-1", zlib_decompress}, /* ZIO_COMPRESS_GZIP1 */ + {"gzip-2", zlib_decompress}, /* ZIO_COMPRESS_GZIP2 */ + {"gzip-3", zlib_decompress}, /* 
ZIO_COMPRESS_GZIP3 */ + {"gzip-4", zlib_decompress}, /* ZIO_COMPRESS_GZIP4 */ + {"gzip-5", zlib_decompress}, /* ZIO_COMPRESS_GZIP5 */ + {"gzip-6", zlib_decompress}, /* ZIO_COMPRESS_GZIP6 */ + {"gzip-7", zlib_decompress}, /* ZIO_COMPRESS_GZIP7 */ + {"gzip-8", zlib_decompress}, /* ZIO_COMPRESS_GZIP8 */ + {"gzip-9", zlib_decompress}, /* ZIO_COMPRESS_GZIP9 */ +}; + + + +static int zio_read_data(blkptr_t *bp, grub_zfs_endian_t endian, + void *buf, struct grub_zfs_data *data); + +static int +zio_read(blkptr_t *bp, grub_zfs_endian_t endian, void **buf, + size_t *size, struct grub_zfs_data *data); + +/* + * Our own version of log2(). Same thing as highbit()-1. + */ +static int +zfs_log2(uint64_t num) +{ + int i = 0; + + while (num > 1) { + i++; + num = num >> 1; + } + + return i; +} + + +/* Checksum Functions */ +static void +zio_checksum_off(const void *buf __attribute__ ((unused)), + uint64_t size __attribute__ ((unused)), + grub_zfs_endian_t endian __attribute__ ((unused)), + zio_cksum_t *zcp) +{ + ZIO_SET_CHECKSUM(zcp, 0, 0, 0, 0); +} + +/* Checksum Table and Values */ +static zio_checksum_info_t zio_checksum_table[ZIO_CHECKSUM_FUNCTIONS] = { + {NULL, 0, 0, "inherit"}, + {NULL, 0, 0, "on"}, + {zio_checksum_off, 0, 0, "off"}, + {zio_checksum_SHA256, 1, 1, "label"}, + {zio_checksum_SHA256, 1, 1, "gang_header"}, + {NULL, 0, 0, "zilog"}, + {fletcher_2, 0, 0, "fletcher2"}, + {fletcher_4, 1, 0, "fletcher4"}, + {zio_checksum_SHA256, 1, 0, "SHA256"}, + {NULL, 0, 0, "zilog2"}, +}; + +/* + * zio_checksum_verify: Provides support for checksum verification. + * + * Fletcher2, Fletcher4, and SHA256 are supported. 
+ * + */ +static int +zio_checksum_verify(zio_cksum_t zc, uint32_t checksum, + grub_zfs_endian_t endian, char *buf, int size) +{ + zio_eck_t *zec = (zio_eck_t *) (buf + size) - 1; + zio_checksum_info_t *ci = &zio_checksum_table[checksum]; + zio_cksum_t actual_cksum, expected_cksum; + + if (checksum >= ZIO_CHECKSUM_FUNCTIONS || ci->ci_func == NULL) { + printf("zfs unknown checksum function %u\n", checksum); + return ZFS_ERR_NOT_IMPLEMENTED_YET; + } + + if (ci->ci_eck) { + expected_cksum = zec->zec_cksum; + zec->zec_cksum = zc; + ci->ci_func(buf, size, endian, &actual_cksum); + zec->zec_cksum = expected_cksum; + zc = expected_cksum; + } else { + ci->ci_func(buf, size, endian, &actual_cksum); + } + + if ((actual_cksum.zc_word[0] != zc.zc_word[0]) + || (actual_cksum.zc_word[1] != zc.zc_word[1]) + || (actual_cksum.zc_word[2] != zc.zc_word[2]) + || (actual_cksum.zc_word[3] != zc.zc_word[3])) { + return ZFS_ERR_BAD_FS; + } + + return ZFS_ERR_NONE; +} + +/* + * vdev_uberblock_compare takes two uberblock structures and returns an integer + * indicating the more recent of the two. + * Return Value = 1 if ub1 is more recent + * Return Value = -1 if ub2 is more recent (0 if both are equally recent) + * The most recent uberblock is determined using its transaction number and + * timestamp. The uberblock with the highest transaction number is + * considered "newer". If the transaction numbers of the two blocks match, the + * timestamps are compared to determine the "newer" of the two.
+ */ +static int +vdev_uberblock_compare(uberblock_t *ub1, uberblock_t *ub2) +{ + grub_zfs_endian_t ub1_endian, ub2_endian; + if (grub_zfs_to_cpu64(ub1->ub_magic, LITTLE_ENDIAN) == UBERBLOCK_MAGIC) + ub1_endian = LITTLE_ENDIAN; + else + ub1_endian = BIG_ENDIAN; + if (grub_zfs_to_cpu64(ub2->ub_magic, LITTLE_ENDIAN) == UBERBLOCK_MAGIC) + ub2_endian = LITTLE_ENDIAN; + else + ub2_endian = BIG_ENDIAN; + + if (grub_zfs_to_cpu64(ub1->ub_txg, ub1_endian) + < grub_zfs_to_cpu64(ub2->ub_txg, ub2_endian)) + return -1; + if (grub_zfs_to_cpu64(ub1->ub_txg, ub1_endian) + > grub_zfs_to_cpu64(ub2->ub_txg, ub2_endian)) + return 1; + + if (grub_zfs_to_cpu64(ub1->ub_timestamp, ub1_endian) + < grub_zfs_to_cpu64(ub2->ub_timestamp, ub2_endian)) + return -1; + if (grub_zfs_to_cpu64(ub1->ub_timestamp, ub1_endian) + > grub_zfs_to_cpu64(ub2->ub_timestamp, ub2_endian)) + return 1; + + return 0; +} + +/* + * Three pieces of information are needed to verify an uberblock: the magic + * number, the version number, and the checksum. 
+ * + * Currently Implemented: version number, magic number, label txg + * Need to Implement: checksum + * + */ +static int +uberblock_verify(uberblock_t *uber, int offset, struct grub_zfs_data *data) +{ + int err; + grub_zfs_endian_t endian = UNKNOWN_ENDIAN; + zio_cksum_t zc; + + if (uber->ub_txg < data->label_txg) { + debug("ignoring partially written label: uber_txg < label_txg %llu %llu\n", + uber->ub_txg, data->label_txg); + return ZFS_ERR_BAD_FS; + } + + if (grub_zfs_to_cpu64(uber->ub_magic, LITTLE_ENDIAN) == UBERBLOCK_MAGIC + && grub_zfs_to_cpu64(uber->ub_version, LITTLE_ENDIAN) > 0 + && grub_zfs_to_cpu64(uber->ub_version, LITTLE_ENDIAN) <= SPA_VERSION) + endian = LITTLE_ENDIAN; + + if (grub_zfs_to_cpu64(uber->ub_magic, BIG_ENDIAN) == UBERBLOCK_MAGIC + && grub_zfs_to_cpu64(uber->ub_version, BIG_ENDIAN) > 0 + && grub_zfs_to_cpu64(uber->ub_version, BIG_ENDIAN) <= SPA_VERSION) + endian = BIG_ENDIAN; + + if (endian == UNKNOWN_ENDIAN) { + printf("invalid uberblock magic\n"); + return ZFS_ERR_BAD_FS; + } + + memset(&zc, 0, sizeof(zc)); + zc.zc_word[0] = grub_cpu_to_zfs64(offset, endian); + err = zio_checksum_verify(zc, ZIO_CHECKSUM_LABEL, endian, + (char *) uber, UBERBLOCK_SIZE(data->vdev_ashift)); + + if (!err) { + /* Check that the data pointed by the rootbp is usable. */ + void *osp = NULL; + size_t ospsize; + err = zio_read(&uber->ub_rootbp, endian, &osp, &ospsize, data); + free(osp); + + if (!err && ospsize < OBJSET_PHYS_SIZE_V14) { + printf("uberblock rootbp points to invalid data\n"); + return ZFS_ERR_BAD_FS; + } + } + + return err; +} + +/* + * Find the best uberblock. + * Return: + * Success - Pointer to the best uberblock. 
+ * Failure - NULL + */ +static uberblock_t *find_bestub(char *ub_array, struct grub_zfs_data *data) +{ + const uint64_t sector = data->vdev_phys_sector; + uberblock_t *ubbest = NULL; + uberblock_t *ubnext; + unsigned int i, offset, pickedub = 0; + int err = ZFS_ERR_NONE; + + const unsigned int UBCOUNT = UBERBLOCK_COUNT(data->vdev_ashift); + const uint64_t UBBYTES = UBERBLOCK_SIZE(data->vdev_ashift); + + for (i = 0; i < UBCOUNT; i++) { + ubnext = (uberblock_t *) (i * UBBYTES + ub_array); + offset = (sector << SPA_MINBLOCKSHIFT) + VDEV_PHYS_SIZE + (i * UBBYTES); + + err = uberblock_verify(ubnext, offset, data); + if (err) + continue; + + if (ubbest == NULL || vdev_uberblock_compare(ubnext, ubbest) > 0) { + ubbest = ubnext; + pickedub = i; + } + } + + if (ubbest) + debug("zfs Found best uberblock at idx %d, txg %llu\n", + pickedub, (unsigned long long) ubbest->ub_txg); + + return ubbest; +} + +static inline size_t +get_psize(blkptr_t *bp, grub_zfs_endian_t endian) +{ + return (((grub_zfs_to_cpu64((bp)->blk_prop, endian) >> 16) & 0xffff) + 1) + << SPA_MINBLOCKSHIFT; +} + +static uint64_t +dva_get_offset(dva_t *dva, grub_zfs_endian_t endian) +{ + return grub_zfs_to_cpu64((dva)->dva_word[1], + endian) << SPA_MINBLOCKSHIFT; +} + +/* + * Read a block of data based on the gang block address dva, + * and put its data in buf. 
+ * + */ +static int +zio_read_gang(blkptr_t *bp, grub_zfs_endian_t endian, dva_t *dva, void *buf, + struct grub_zfs_data *data) +{ + zio_gbh_phys_t *zio_gb; + uint64_t offset, sector; + unsigned i; + int err; + zio_cksum_t zc; + + memset(&zc, 0, sizeof(zc)); + + zio_gb = malloc(SPA_GANGBLOCKSIZE); + if (!zio_gb) + return ZFS_ERR_OUT_OF_MEMORY; + + offset = dva_get_offset(dva, endian); + sector = DVA_OFFSET_TO_PHYS_SECTOR(offset); + + /* read in the gang block header */ + err = zfs_devread(sector, 0, SPA_GANGBLOCKSIZE, (char *) zio_gb); + + if (err) { + free(zio_gb); + return err; + } + + /* XXX */ + /* self checksuming the gang block header */ + ZIO_SET_CHECKSUM(&zc, DVA_GET_VDEV(dva), + dva_get_offset(dva, endian), bp->blk_birth, 0); + err = zio_checksum_verify(zc, ZIO_CHECKSUM_GANG_HEADER, endian, + (char *) zio_gb, SPA_GANGBLOCKSIZE); + if (err) { + free(zio_gb); + return err; + } + + endian = (grub_zfs_to_cpu64(bp->blk_prop, endian) >> 63) & 1; + + for (i = 0; i < SPA_GBH_NBLKPTRS; i++) { + if (zio_gb->zg_blkptr[i].blk_birth == 0) + continue; + + err = zio_read_data(&zio_gb->zg_blkptr[i], endian, buf, data); + if (err) { + free(zio_gb); + return err; + } + buf = (char *) buf + get_psize(&zio_gb->zg_blkptr[i], endian); + } + free(zio_gb); + return ZFS_ERR_NONE; +} + +/* + * Read in a block of raw data to buf. 
+ */ +static int +zio_read_data(blkptr_t *bp, grub_zfs_endian_t endian, void *buf, + struct grub_zfs_data *data) +{ + int i, psize; + int err = ZFS_ERR_NONE; + + psize = get_psize(bp, endian); + + /* pick a good dva from the block pointer */ + for (i = 0; i < SPA_DVAS_PER_BP; i++) { + uint64_t offset, sector; + + if (bp->blk_dva[i].dva_word[0] == 0 && bp->blk_dva[i].dva_word[1] == 0) + continue; + + if ((grub_zfs_to_cpu64(bp->blk_dva[i].dva_word[1], endian)>>63) & 1) { + err = zio_read_gang(bp, endian, &bp->blk_dva[i], buf, data); + } else { + /* read in a data block */ + offset = dva_get_offset(&bp->blk_dva[i], endian); + sector = DVA_OFFSET_TO_PHYS_SECTOR(offset); + + err = zfs_devread(sector, 0, psize, buf); + } + + if (!err) { + /*Check the underlying checksum before we rule this DVA as "good"*/ + uint32_t checkalgo = (grub_zfs_to_cpu64((bp)->blk_prop, endian) >> 40) & 0xff; + + err = zio_checksum_verify(bp->blk_cksum, checkalgo, endian, buf, psize); + if (!err) + return ZFS_ERR_NONE; + } + + /* If read failed or checksum bad, reset the error. Hopefully we've got some more DVA's to try.*/ + } + + if (!err) { + printf("couldn't find a valid DVA\n"); + err = ZFS_ERR_BAD_FS; + } + + return err; +} + +/* + * Read in a block of data, verify its checksum, decompress if needed, + * and put the uncompressed data in buf. + */ +static int +zio_read(blkptr_t *bp, grub_zfs_endian_t endian, void **buf, + size_t *size, struct grub_zfs_data *data) +{ + size_t lsize, psize; + unsigned int comp; + char *compbuf = NULL; + int err; + + *buf = NULL; + + comp = (grub_zfs_to_cpu64((bp)->blk_prop, endian)>>32) & 0xff; + lsize = (BP_IS_HOLE(bp) ? 
0 : + (((grub_zfs_to_cpu64((bp)->blk_prop, endian) & 0xffff) + 1) + << SPA_MINBLOCKSHIFT)); + psize = get_psize(bp, endian); + + if (size) + *size = lsize; + + if (comp >= ZIO_COMPRESS_FUNCTIONS) { + printf("compression algorithm %u not supported\n", (unsigned int) comp); + return ZFS_ERR_NOT_IMPLEMENTED_YET; + } + + if (comp != ZIO_COMPRESS_OFF && decomp_table[comp].decomp_func == NULL) { + printf("compression algorithm %s not supported\n", decomp_table[comp].name); + return ZFS_ERR_NOT_IMPLEMENTED_YET; + } + + if (comp != ZIO_COMPRESS_OFF) { + compbuf = malloc(psize); + if (!compbuf) + return ZFS_ERR_OUT_OF_MEMORY; + } else { + compbuf = *buf = malloc(lsize); + } + + err = zio_read_data(bp, endian, compbuf, data); + if (err) { + free(compbuf); + *buf = NULL; + return err; + } + + if (comp != ZIO_COMPRESS_OFF) { + *buf = malloc(lsize); + if (!*buf) { + free(compbuf); + return ZFS_ERR_OUT_OF_MEMORY; + } + + err = decomp_table[comp].decomp_func(compbuf, *buf, psize, lsize); + free(compbuf); + if (err) { + free(*buf); + *buf = NULL; + return err; + } + } + + return ZFS_ERR_NONE; +} + +/* + * Get the block from a block id. + * push the block onto the stack. 
+ * + */ +static int +dmu_read(dnode_end_t *dn, uint64_t blkid, void **buf, + grub_zfs_endian_t *endian_out, struct grub_zfs_data *data) +{ + int idx, level; + blkptr_t *bp_array = dn->dn.dn_blkptr; + int epbs = dn->dn.dn_indblkshift - SPA_BLKPTRSHIFT; + blkptr_t *bp; + void *tmpbuf = 0; + grub_zfs_endian_t endian; + int err = ZFS_ERR_NONE; + + bp = malloc(sizeof(blkptr_t)); + if (!bp) + return ZFS_ERR_OUT_OF_MEMORY; + + endian = dn->endian; + for (level = dn->dn.dn_nlevels - 1; level >= 0; level--) { + idx = (blkid >> (epbs * level)) & ((1 << epbs) - 1); + *bp = bp_array[idx]; + if (bp_array != dn->dn.dn_blkptr) { + free(bp_array); + bp_array = 0; + } + + if (BP_IS_HOLE(bp)) { + size_t size = grub_zfs_to_cpu16(dn->dn.dn_datablkszsec, + dn->endian) + << SPA_MINBLOCKSHIFT; + *buf = malloc(size); + if (!*buf) { + err = ZFS_ERR_OUT_OF_MEMORY; + break; + } + memset(*buf, 0, size); + endian = (grub_zfs_to_cpu64(bp->blk_prop, endian) >> 63) & 1; + break; + } + if (level == 0) { + err = zio_read(bp, endian, buf, 0, data); + endian = (grub_zfs_to_cpu64(bp->blk_prop, endian) >> 63) & 1; + break; + } + err = zio_read(bp, endian, &tmpbuf, 0, data); + endian = (grub_zfs_to_cpu64(bp->blk_prop, endian) >> 63) & 1; + if (err) + break; + bp_array = tmpbuf; + } + if (bp_array != dn->dn.dn_blkptr) + free(bp_array); + if (endian_out) + *endian_out = endian; + + free(bp); + return err; +} + +/* + * mzap_lookup: Looks up property described by "name" and returns the value + * in "value".
+ */ +static int +mzap_lookup(mzap_phys_t *zapobj, grub_zfs_endian_t endian, + int objsize, char *name, uint64_t *value) +{ + int i, chunks; + mzap_ent_phys_t *mzap_ent = zapobj->mz_chunk; + + chunks = objsize / MZAP_ENT_LEN - 1; + for (i = 0; i < chunks; i++) { + if (strcmp(mzap_ent[i].mze_name, name) == 0) { + *value = grub_zfs_to_cpu64(mzap_ent[i].mze_value, endian); + return ZFS_ERR_NONE; + } + } + + printf("couldn't find '%s'\n", name); + return ZFS_ERR_FILE_NOT_FOUND; +} + +static int +mzap_iterate(mzap_phys_t *zapobj, grub_zfs_endian_t endian, int objsize, + int (*hook)(const char *name, + uint64_t val, + struct grub_zfs_data *data), + struct grub_zfs_data *data) +{ + int i, chunks; + mzap_ent_phys_t *mzap_ent = zapobj->mz_chunk; + + chunks = objsize / MZAP_ENT_LEN - 1; + for (i = 0; i < chunks; i++) { + if (hook(mzap_ent[i].mze_name, + grub_zfs_to_cpu64(mzap_ent[i].mze_value, endian), + data)) + return 1; + } + + return 0; +} + +static uint64_t +zap_hash(uint64_t salt, const char *name) +{ + static uint64_t table[256]; + const uint8_t *cp; + uint8_t c; + uint64_t crc = salt; + + if (table[128] == 0) { + uint64_t *ct; + int i, j; + for (i = 0; i < 256; i++) { + for (ct = table + i, *ct = i, j = 8; j > 0; j--) + *ct = (*ct >> 1) ^ (-(*ct & 1) & ZFS_CRC64_POLY); + } + } + + for (cp = (const uint8_t *) name; (c = *cp) != '\0'; cp++) + crc = (crc >> 8) ^ table[(crc ^ c) & 0xFF]; + + /* + * Only use 28 bits, since we need 4 bits in the cookie for the + * collision differentiator. We MUST use the high bits, since + * those are the ones that we first pay attention to when + * choosing the bucket. + */ + crc &= ~((1ULL << (64 - ZAP_HASHBITS)) - 1); + + return crc; +} + +/* + * Only to be used on 8-bit arrays. + * array_len is actual len in bytes (not encoded le_value_length). + * buf is null-terminated.
+ */ +/* XXX */ +static int +zap_leaf_array_equal(zap_leaf_phys_t *l, grub_zfs_endian_t endian, + int blksft, int chunk, int array_len, const char *buf) +{ + int bseen = 0; + + while (bseen < array_len) { + struct zap_leaf_array *la = &ZAP_LEAF_CHUNK(l, blksft, chunk).l_array; + int toread = MIN(array_len - bseen, ZAP_LEAF_ARRAY_BYTES); + + if (chunk >= ZAP_LEAF_NUMCHUNKS(blksft)) + return 0; + + if (memcmp(la->la_array, buf + bseen, toread) != 0) + break; + chunk = grub_zfs_to_cpu16(la->la_next, endian); + bseen += toread; + } + return (bseen == array_len); +} + +/* XXX */ +static int +zap_leaf_array_get(zap_leaf_phys_t *l, grub_zfs_endian_t endian, int blksft, + int chunk, int array_len, char *buf) +{ + int bseen = 0; + + while (bseen < array_len) { + struct zap_leaf_array *la = &ZAP_LEAF_CHUNK(l, blksft, chunk).l_array; + int toread = MIN(array_len - bseen, ZAP_LEAF_ARRAY_BYTES); + + if (chunk >= ZAP_LEAF_NUMCHUNKS(blksft)) + /* Don't use errno because this error is to be ignored. */ + return ZFS_ERR_BAD_FS; + + memcpy(buf + bseen, la->la_array, toread); + chunk = grub_zfs_to_cpu16(la->la_next, endian); + bseen += toread; + } + return ZFS_ERR_NONE; +} + + +/* + * Given a zap_leaf_phys_t, walk thru the zap leaf chunks to get the + * value for the property "name". 
+ * + */ +/* XXX */ +static int +zap_leaf_lookup(zap_leaf_phys_t *l, grub_zfs_endian_t endian, + int blksft, uint64_t h, + const char *name, uint64_t *value) +{ + uint16_t chunk; + struct zap_leaf_entry *le; + + /* Verify if this is a valid leaf block */ + if (grub_zfs_to_cpu64(l->l_hdr.lh_block_type, endian) != ZBT_LEAF) { + printf("invalid leaf type\n"); + return ZFS_ERR_BAD_FS; + } + if (grub_zfs_to_cpu32(l->l_hdr.lh_magic, endian) != ZAP_LEAF_MAGIC) { + printf("invalid leaf magic\n"); + return ZFS_ERR_BAD_FS; + } + + for (chunk = grub_zfs_to_cpu16(l->l_hash[LEAF_HASH(blksft, h)], endian); + chunk != CHAIN_END; chunk = le->le_next) { + + if (chunk >= ZAP_LEAF_NUMCHUNKS(blksft)) { + printf("invalid chunk number\n"); + return ZFS_ERR_BAD_FS; + } + + le = ZAP_LEAF_ENTRY(l, blksft, chunk); + + /* Verify the chunk entry */ + if (le->le_type != ZAP_CHUNK_ENTRY) { + printf("invalid chunk entry\n"); + return ZFS_ERR_BAD_FS; + } + + if (grub_zfs_to_cpu64(le->le_hash, endian) != h) + continue; + + if (zap_leaf_array_equal(l, endian, blksft, + grub_zfs_to_cpu16(le->le_name_chunk, endian), + grub_zfs_to_cpu16(le->le_name_length, endian), + name)) { + struct zap_leaf_array *la; + + if (le->le_int_size != 8 || le->le_value_length != 1) { + printf("invalid leaf chunk entry\n"); + return ZFS_ERR_BAD_FS; + } + /* get the uint64_t property value */ + la = &ZAP_LEAF_CHUNK(l, blksft, le->le_value_chunk).l_array; + + *value = grub_be_to_cpu64(la->la_array64); + + return ZFS_ERR_NONE; + } + } + + printf("couldn't find '%s'\n", name); + return ZFS_ERR_FILE_NOT_FOUND; +} + + +/* Verify if this is a fat zap header block */ +static int +zap_verify(zap_phys_t *zap) +{ + if (zap->zap_magic != (uint64_t) ZAP_MAGIC) { + printf("bad ZAP magic\n"); + return ZFS_ERR_BAD_FS; + } + + if (zap->zap_flags != 0) { + printf("bad ZAP flags\n"); + return ZFS_ERR_BAD_FS; + } + + if (zap->zap_salt == 0) { + printf("bad ZAP salt\n"); + return ZFS_ERR_BAD_FS; + } + + return ZFS_ERR_NONE; +} + +/* + * Fat 
ZAP lookup + * + */ +/* XXX */ +static int +fzap_lookup(dnode_end_t *zap_dnode, zap_phys_t *zap, + char *name, uint64_t *value, struct grub_zfs_data *data) +{ + void *l; + uint64_t hash, idx, blkid; + int blksft = zfs_log2(grub_zfs_to_cpu16(zap_dnode->dn.dn_datablkszsec, + zap_dnode->endian) << DNODE_SHIFT); + int err; + grub_zfs_endian_t leafendian; + + err = zap_verify(zap); + if (err) + return err; + + hash = zap_hash(zap->zap_salt, name); + + /* get block id from index */ + if (zap->zap_ptrtbl.zt_numblks != 0) { + printf("external pointer tables not supported\n"); + return ZFS_ERR_NOT_IMPLEMENTED_YET; + } + idx = ZAP_HASH_IDX(hash, zap->zap_ptrtbl.zt_shift); + blkid = ((uint64_t *) zap)[idx + (1 << (blksft - 3 - 1))]; + + /* Get the leaf block */ + if ((1U << blksft) < sizeof(zap_leaf_phys_t)) { + printf("ZAP leaf is too small\n"); + return ZFS_ERR_BAD_FS; + } + err = dmu_read(zap_dnode, blkid, &l, &leafendian, data); + if (err) + return err; + + err = zap_leaf_lookup(l, leafendian, blksft, hash, name, value); + free(l); + return err; +} + +/* XXX */ +static int +fzap_iterate(dnode_end_t *zap_dnode, zap_phys_t *zap, + int (*hook)(const char *name, + uint64_t val, + struct grub_zfs_data *data), + struct grub_zfs_data *data) +{ + zap_leaf_phys_t *l; + void *l_in; + uint64_t idx, blkid; + uint16_t chunk; + int blksft = zfs_log2(grub_zfs_to_cpu16(zap_dnode->dn.dn_datablkszsec, + zap_dnode->endian) << DNODE_SHIFT); + int err; + grub_zfs_endian_t endian; + + if (zap_verify(zap)) + return 0; + + /* get block id from index */ + if (zap->zap_ptrtbl.zt_numblks != 0) { + printf("external pointer tables not supported\n"); + return 0; + } + /* Get the leaf block */ + if ((1U << blksft) < sizeof(zap_leaf_phys_t)) { + printf("ZAP leaf is too small\n"); + return 0; + } + for (idx = 0; idx < zap->zap_ptrtbl.zt_numblks; idx++) { + blkid = ((uint64_t *) zap)[idx + (1 << (blksft - 3 - 1))]; + + err = dmu_read(zap_dnode, blkid, &l_in, &endian, data); + l = l_in; + if (err) + 
continue; + + /* Verify if this is a valid leaf block */ + if (grub_zfs_to_cpu64(l->l_hdr.lh_block_type, endian) != ZBT_LEAF) { + free(l); + continue; + } + if (grub_zfs_to_cpu32(l->l_hdr.lh_magic, endian) != ZAP_LEAF_MAGIC) { + free(l); + continue; + } + + for (chunk = 0; chunk < ZAP_LEAF_NUMCHUNKS(blksft); chunk++) { + char *buf; + struct zap_leaf_array *la; + struct zap_leaf_entry *le; + uint64_t val; + le = ZAP_LEAF_ENTRY(l, blksft, chunk); + + /* Verify the chunk entry */ + if (le->le_type != ZAP_CHUNK_ENTRY) + continue; + + buf = malloc(grub_zfs_to_cpu16(le->le_name_length, endian) + + 1); + if (zap_leaf_array_get(l, endian, blksft, le->le_name_chunk, + le->le_name_length, buf)) { + free(buf); + continue; + } + buf[le->le_name_length] = 0; + + if (le->le_int_size != 8 + || grub_zfs_to_cpu16(le->le_value_length, endian) != 1) + continue; + + /* get the uint64_t property value */ + la = &ZAP_LEAF_CHUNK(l, blksft, le->le_value_chunk).l_array; + val = grub_be_to_cpu64(la->la_array64); + if (hook(buf, val, data)) + return 1; + free(buf); + } + } + return 0; +} + + +/* + * Read in the data of a zap object and find the value for a matching + * property name. + * + */ +static int +zap_lookup(dnode_end_t *zap_dnode, char *name, uint64_t *val, + struct grub_zfs_data *data) +{ + uint64_t block_type; + int size; + void *zapbuf; + int err; + grub_zfs_endian_t endian; + + /* Read in the first block of the zap object data. 
*/ + size = grub_zfs_to_cpu16(zap_dnode->dn.dn_datablkszsec, + zap_dnode->endian) << SPA_MINBLOCKSHIFT; + err = dmu_read(zap_dnode, 0, &zapbuf, &endian, data); + if (err) + return err; + block_type = grub_zfs_to_cpu64(*((uint64_t *) zapbuf), endian); + + if (block_type == ZBT_MICRO) { + err = (mzap_lookup(zapbuf, endian, size, name, val)); + free(zapbuf); + return err; + } else if (block_type == ZBT_HEADER) { + /* this is a fat zap */ + err = (fzap_lookup(zap_dnode, zapbuf, name, val, data)); + free(zapbuf); + return err; + } + + printf("unknown ZAP type\n"); + return ZFS_ERR_BAD_FS; +} + +static int +zap_iterate(dnode_end_t *zap_dnode, + int (*hook)(const char *name, uint64_t val, + struct grub_zfs_data *data), + struct grub_zfs_data *data) +{ + uint64_t block_type; + int size; + void *zapbuf; + int err; + int ret; + grub_zfs_endian_t endian; + + /* Read in the first block of the zap object data. */ + size = grub_zfs_to_cpu16(zap_dnode->dn.dn_datablkszsec, zap_dnode->endian) << SPA_MINBLOCKSHIFT; + err = dmu_read(zap_dnode, 0, &zapbuf, &endian, data); + if (err) + return 0; + block_type = grub_zfs_to_cpu64(*((uint64_t *) zapbuf), endian); + + if (block_type == ZBT_MICRO) { + ret = mzap_iterate(zapbuf, endian, size, hook, data); + free(zapbuf); + return ret; + } else if (block_type == ZBT_HEADER) { + /* this is a fat zap */ + ret = fzap_iterate(zap_dnode, zapbuf, hook, data); + free(zapbuf); + return ret; + } + printf("unknown ZAP type\n"); + return 0; +} + + +/* + * Get the dnode of an object number from the metadnode of an object set. 
+ * + * Input + * mdn - metadnode to get the object dnode + * objnum - object number for the object dnode + * buf - data buffer that holds the returning dnode + */ +static int +dnode_get(dnode_end_t *mdn, uint64_t objnum, uint8_t type, + dnode_end_t *buf, struct grub_zfs_data *data) +{ + uint64_t blkid, blksz; /* the block id this object dnode is in */ + int epbs; /* shift of number of dnodes in a block */ + int idx; /* index within a block */ + void *dnbuf; + int err; + grub_zfs_endian_t endian; + + blksz = grub_zfs_to_cpu16(mdn->dn.dn_datablkszsec, + mdn->endian) << SPA_MINBLOCKSHIFT; + + epbs = zfs_log2(blksz) - DNODE_SHIFT; + blkid = objnum >> epbs; + idx = objnum & ((1 << epbs) - 1); + + if (data->dnode_buf != NULL && memcmp(data->dnode_mdn, mdn, + sizeof(*mdn)) == 0 + && objnum >= data->dnode_start && objnum < data->dnode_end) { + memmove(&(buf->dn), &(data->dnode_buf)[idx], DNODE_SIZE); + buf->endian = data->dnode_endian; + if (type && buf->dn.dn_type != type) { + printf("incorrect dnode type: %02X != %02x\n", buf->dn.dn_type, type); + return ZFS_ERR_BAD_FS; + } + return ZFS_ERR_NONE; + } + + err = dmu_read(mdn, blkid, &dnbuf, &endian, data); + if (err) + return err; + + free(data->dnode_buf); + free(data->dnode_mdn); + data->dnode_mdn = malloc(sizeof(*mdn)); + if (!data->dnode_mdn) { + data->dnode_buf = 0; + } else { + memcpy(data->dnode_mdn, mdn, sizeof(*mdn)); + data->dnode_buf = dnbuf; + data->dnode_start = blkid << epbs; + data->dnode_end = (blkid + 1) << epbs; + data->dnode_endian = endian; + } + + memmove(&(buf->dn), (dnode_phys_t *) dnbuf + idx, DNODE_SIZE); + buf->endian = endian; + if (type && buf->dn.dn_type != type) { + printf("incorrect dnode type\n"); + return ZFS_ERR_BAD_FS; + } + + return ZFS_ERR_NONE; +} + +/* + * Get the file dnode for a given file name where mdn is the meta dnode + * for this ZFS object set. When found, place the file dnode in dn. + * The 'path' argument will be mangled. 
+ * + */ +static int +dnode_get_path(dnode_end_t *mdn, const char *path_in, dnode_end_t *dn, + struct grub_zfs_data *data) +{ + uint64_t objnum, version; + char *cname, ch; + int err = ZFS_ERR_NONE; + char *path, *path_buf; + struct dnode_chain { + struct dnode_chain *next; + dnode_end_t dn; + }; + struct dnode_chain *dnode_path = 0, *dn_new, *root; + + dn_new = malloc(sizeof(*dn_new)); + if (!dn_new) + return ZFS_ERR_OUT_OF_MEMORY; + dn_new->next = 0; + dnode_path = root = dn_new; + + err = dnode_get(mdn, MASTER_NODE_OBJ, DMU_OT_MASTER_NODE, + &(dnode_path->dn), data); + if (err) { + free(dn_new); + return err; + } + + err = zap_lookup(&(dnode_path->dn), ZPL_VERSION_STR, &version, data); + if (err) { + free(dn_new); + return err; + } + if (version > ZPL_VERSION) { + free(dn_new); + printf("too new ZPL version\n"); + return ZFS_ERR_NOT_IMPLEMENTED_YET; + } + + err = zap_lookup(&(dnode_path->dn), ZFS_ROOT_OBJ, &objnum, data); + if (err) { + free(dn_new); + return err; + } + + err = dnode_get(mdn, objnum, 0, &(dnode_path->dn), data); + if (err) { + free(dn_new); + return err; + } + + path = path_buf = strdup(path_in); + if (!path_buf) { + free(dn_new); + return ZFS_ERR_OUT_OF_MEMORY; + } + + while (1) { + /* skip leading slashes */ + while (*path == '/') + path++; + if (!*path) + break; + /* get the next component name */ + cname = path; + while (*path && *path != '/') + path++; + /* Skip dot. */ + if (cname + 1 == path && cname[0] == '.') + continue; + /* Handle double dot. */ + if (cname + 2 == path && cname[0] == '.' 
&& cname[1] == '.') { + if (dn_new->next) { + dn_new = dnode_path; + dnode_path = dn_new->next; + free(dn_new); + } else { + printf("can't resolve ..\n"); + err = ZFS_ERR_FILE_NOT_FOUND; + break; + } + continue; + } + + ch = *path; + *path = 0; /* ensure null termination */ + + if (dnode_path->dn.dn.dn_type != DMU_OT_DIRECTORY_CONTENTS) { + free(path_buf); + printf("not a directory\n"); + return ZFS_ERR_BAD_FILE_TYPE; + } + err = zap_lookup(&(dnode_path->dn), cname, &objnum, data); + if (err) + break; + + dn_new = malloc(sizeof(*dn_new)); + if (!dn_new) { + err = ZFS_ERR_OUT_OF_MEMORY; + break; + } + dn_new->next = dnode_path; + dnode_path = dn_new; + + objnum = ZFS_DIRENT_OBJ(objnum); + err = dnode_get(mdn, objnum, 0, &(dnode_path->dn), data); + if (err) + break; + + *path = ch; + } + + if (!err) + memcpy(dn, &(dnode_path->dn), sizeof(*dn)); + + while (dnode_path) { + dn_new = dnode_path->next; + free(dnode_path); + dnode_path = dn_new; + } + free(path_buf); + return err; +} + + +/* + * Given a MOS metadnode, get the metadnode of a given filesystem name (fsname), + * e.g. pool/rootfs, or a given object number (obj), e.g. the object number + * of pool/rootfs. + * + * If no fsname and no obj are given, return the DSL_DIR metadnode. + * If fsname is given, return its metadnode and its matching object number. + * If only obj is given, return the metadnode for this object number. 
+ * + */ +static int +get_filesystem_dnode(dnode_end_t *mosmdn, char *fsname, + dnode_end_t *mdn, struct grub_zfs_data *data) +{ + uint64_t objnum; + int err; + + err = dnode_get(mosmdn, DMU_POOL_DIRECTORY_OBJECT, + DMU_OT_OBJECT_DIRECTORY, mdn, data); + if (err) + return err; + + err = zap_lookup(mdn, DMU_POOL_ROOT_DATASET, &objnum, data); + if (err) + return err; + + err = dnode_get(mosmdn, objnum, DMU_OT_DSL_DIR, mdn, data); + if (err) + return err; + + while (*fsname) { + uint64_t childobj; + char *cname, ch; + + while (*fsname == '/') + fsname++; + + if (!*fsname || *fsname == '@') + break; + + cname = fsname; + while (*fsname && !isspace(*fsname) && *fsname != '/') + fsname++; + ch = *fsname; + *fsname = 0; + + childobj = grub_zfs_to_cpu64((((dsl_dir_phys_t *) DN_BONUS(&mdn->dn)))->dd_child_dir_zapobj, mdn->endian); + err = dnode_get(mosmdn, childobj, + DMU_OT_DSL_DIR_CHILD_MAP, mdn, data); + if (err) + return err; + + err = zap_lookup(mdn, cname, &objnum, data); + if (err) + return err; + + err = dnode_get(mosmdn, objnum, DMU_OT_DSL_DIR, mdn, data); + if (err) + return err; + + *fsname = ch; + } + return ZFS_ERR_NONE; +} + +static int +make_mdn(dnode_end_t *mdn, struct grub_zfs_data *data) +{ + void *osp; + blkptr_t *bp; + size_t ospsize; + int err; + + bp = &(((dsl_dataset_phys_t *) DN_BONUS(&mdn->dn))->ds_bp); + err = zio_read(bp, mdn->endian, &osp, &ospsize, data); + if (err) + return err; + if (ospsize < OBJSET_PHYS_SIZE_V14) { + free(osp); + printf("too small osp\n"); + return ZFS_ERR_BAD_FS; + } + + mdn->endian = (grub_zfs_to_cpu64(bp->blk_prop, mdn->endian)>>63) & 1; + memmove((char *) &(mdn->dn), + (char *) &((objset_phys_t *) osp)->os_meta_dnode, DNODE_SIZE); + free(osp); + return ZFS_ERR_NONE; +} + +static int +dnode_get_fullpath(const char *fullpath, dnode_end_t *mdn, + uint64_t *mdnobj, dnode_end_t *dn, int *isfs, + struct grub_zfs_data *data) +{ + char *fsname, *snapname; + const char *ptr_at, *filename; + uint64_t headobj; + int err; + + ptr_at 
= strchr(fullpath, '@'); + if (!ptr_at) { + *isfs = 1; + filename = 0; + snapname = 0; + fsname = strdup(fullpath); + } else { + const char *ptr_slash = strchr(ptr_at, '/'); + + *isfs = 0; + fsname = malloc(ptr_at - fullpath + 1); + if (!fsname) + return ZFS_ERR_OUT_OF_MEMORY; + memcpy(fsname, fullpath, ptr_at - fullpath); + fsname[ptr_at - fullpath] = 0; + if (ptr_at[1] && ptr_at[1] != '/') { + snapname = malloc(ptr_slash - ptr_at); + if (!snapname) { + free(fsname); + return ZFS_ERR_OUT_OF_MEMORY; + } + memcpy(snapname, ptr_at + 1, ptr_slash - ptr_at - 1); + snapname[ptr_slash - ptr_at - 1] = 0; + } else { + snapname = 0; + } + if (ptr_slash) + filename = ptr_slash; + else + filename = "/"; + printf("zfs fsname = '%s' snapname='%s' filename = '%s'\n", + fsname, snapname, filename); + } + + + err = get_filesystem_dnode(&(data->mos), fsname, dn, data); + + if (err) { + free(fsname); + free(snapname); + return err; + } + + headobj = grub_zfs_to_cpu64(((dsl_dir_phys_t *) DN_BONUS(&dn->dn))->dd_head_dataset_obj, dn->endian); + + err = dnode_get(&(data->mos), headobj, DMU_OT_DSL_DATASET, mdn, data); + if (err) { + free(fsname); + free(snapname); + return err; + } + + if (snapname) { + uint64_t snapobj; + + snapobj = grub_zfs_to_cpu64(((dsl_dataset_phys_t *) DN_BONUS(&mdn->dn))->ds_snapnames_zapobj, mdn->endian); + + err = dnode_get(&(data->mos), snapobj, + DMU_OT_DSL_DS_SNAP_MAP, mdn, data); + if (!err) + err = zap_lookup(mdn, snapname, &headobj, data); + if (!err) + err = dnode_get(&(data->mos), headobj, DMU_OT_DSL_DATASET, mdn, data); + if (err) { + free(fsname); + free(snapname); + return err; + } + } + + if (mdnobj) + *mdnobj = headobj; + + make_mdn(mdn, data); + + if (*isfs) { + free(fsname); + free(snapname); + return ZFS_ERR_NONE; + } + err = dnode_get_path(mdn, filename, dn, data); + free(fsname); + free(snapname); + return err; +} + +/* + * For a given XDR packed nvlist, verify the first 4 bytes and move on. 
+ * + * An XDR packed nvlist is encoded as (comments from nvs_xdr_create) : + * + * encoding method/host endian (4 bytes) + * nvl_version (4 bytes) + * nvl_nvflag (4 bytes) + * encoded nvpairs: + * encoded size of the nvpair (4 bytes) + * decoded size of the nvpair (4 bytes) + * name string size (4 bytes) + * name string data (sizeof(NV_ALIGN4(string)) + * data type (4 bytes) + * # of elements in the nvpair (4 bytes) + * data + * 2 zero's for the last nvpair + * (end of the entire list) (8 bytes) + * + */ + +static int +nvlist_find_value(char *nvlist, char *name, int valtype, char **val, + size_t *size_out, size_t *nelm_out) +{ + int name_len, type, encode_size; + char *nvpair, *nvp_name; + + /* Verify if the 1st and 2nd byte in the nvlist are valid. */ + /* NOTE: independently of what endianness header announces all + subsequent values are big-endian. */ + if (nvlist[0] != NV_ENCODE_XDR || (nvlist[1] != NV_LITTLE_ENDIAN + && nvlist[1] != NV_BIG_ENDIAN)) { + printf("zfs incorrect nvlist header\n"); + return ZFS_ERR_BAD_FS; + } + + /* skip the header, nvl_version, and nvl_nvflag */ + nvlist = nvlist + 4 * 3; + /* + * Loop thru the nvpair list + * The XDR representation of an integer is in big-endian byte order. 
+ */ + while ((encode_size = grub_be_to_cpu32(*(uint32_t *) nvlist))) { + int nelm; + + nvpair = nvlist + 4 * 2; /* skip the encode/decode size */ + + name_len = grub_be_to_cpu32(*(uint32_t *) nvpair); + nvpair += 4; + + nvp_name = nvpair; + nvpair = nvpair + ((name_len + 3) & ~3); /* align */ + + type = grub_be_to_cpu32(*(uint32_t *) nvpair); + nvpair += 4; + + nelm = grub_be_to_cpu32(*(uint32_t *) nvpair); + if (nelm < 1) { + printf("empty nvpair\n"); + return ZFS_ERR_BAD_FS; + } + + nvpair += 4; + + if ((strncmp(nvp_name, name, name_len) == 0) && type == valtype) { + *val = nvpair; + *size_out = encode_size; + if (nelm_out) + *nelm_out = nelm; + return 1; + } + + nvlist += encode_size; /* goto the next nvpair */ + } + return 0; +} + +int +grub_zfs_nvlist_lookup_uint64(char *nvlist, char *name, uint64_t *out) +{ + char *nvpair; + size_t size; + int found; + + found = nvlist_find_value(nvlist, name, DATA_TYPE_UINT64, &nvpair, &size, 0); + if (!found) + return 0; + if (size < sizeof(uint64_t)) { + printf("invalid uint64\n"); + return ZFS_ERR_BAD_FS; + } + + *out = grub_be_to_cpu64(*(uint64_t *) nvpair); + return 1; +} + +char * +grub_zfs_nvlist_lookup_string(char *nvlist, char *name) +{ + char *nvpair; + char *ret; + size_t slen; + size_t size; + int found; + + found = nvlist_find_value(nvlist, name, DATA_TYPE_STRING, &nvpair, &size, 0); + if (!found) + return 0; + if (size < 4) { + printf("invalid string\n"); + return 0; + } + slen = grub_be_to_cpu32(*(uint32_t *) nvpair); + if (slen > size - 4) + slen = size - 4; + ret = malloc(slen + 1); + if (!ret) + return 0; + memcpy(ret, nvpair + 4, slen); + ret[slen] = 0; + return ret; +} + +char * +grub_zfs_nvlist_lookup_nvlist(char *nvlist, char *name) +{ + char *nvpair; + char *ret; + size_t size; + int found; + + found = nvlist_find_value(nvlist, name, DATA_TYPE_NVLIST, &nvpair, + &size, 0); + if (!found) + return 0; + ret = calloc(1, size + 3 * sizeof(uint32_t)); + if (!ret) + return 0; + memcpy(ret, nvlist, 
sizeof(uint32_t)); + + memcpy(ret + sizeof(uint32_t), nvpair, size); + return ret; +} + +int +grub_zfs_nvlist_lookup_nvlist_array_get_nelm(char *nvlist, char *name) +{ + char *nvpair; + size_t nelm, size; + int found; + + found = nvlist_find_value(nvlist, name, DATA_TYPE_NVLIST, &nvpair, + &size, &nelm); + if (!found) + return -1; + return nelm; +} + +char * +grub_zfs_nvlist_lookup_nvlist_array(char *nvlist, char *name, + size_t index) +{ + char *nvpair, *nvpairptr; + int found; + char *ret; + size_t size; + unsigned i; + size_t nelm; + + found = nvlist_find_value(nvlist, name, DATA_TYPE_NVLIST, &nvpair, + &size, &nelm); + if (!found) + return 0; + if (index >= nelm) { + printf("trying to lookup past nvlist array\n"); + return 0; + } + + nvpairptr = nvpair; + + for (i = 0; i < index; i++) { + uint32_t encode_size; + + /* skip the header, nvl_version, and nvl_nvflag */ + nvpairptr = nvpairptr + 4 * 2; + + /* advance nvpairptr, not nvlist: the nvlist header is copied below */ + while (nvpairptr < nvpair + size + && (encode_size = grub_be_to_cpu32(*(uint32_t *) nvpairptr))) + nvpairptr += encode_size; /* go to the next nvpair */ + + nvpairptr = nvpairptr + 4 * 2; /* skip the ending 2 zeros - 8 bytes */ + } + + if (nvpairptr >= nvpair + size + || nvpairptr + grub_be_to_cpu32(*(uint32_t *) (nvpairptr + 4 * 2)) + >= nvpair + size) { + printf("incorrect nvlist array\n"); + return 0; + } + + ret = calloc(1, grub_be_to_cpu32(*(uint32_t *) (nvpairptr + 4 * 2)) + + 3 * sizeof(uint32_t)); + if (!ret) + return 0; + memcpy(ret, nvlist, sizeof(uint32_t)); + + memcpy(ret + sizeof(uint32_t), nvpairptr, size); + return ret; +} + +static int +zfs_fetch_nvlist(struct grub_zfs_data *data, char **nvlist) +{ + int err; + + *nvlist = malloc(VDEV_PHYS_SIZE); + if (!*nvlist) + return ZFS_ERR_OUT_OF_MEMORY; + /* Read in the vdev name-value pair list (112K). */ + err = zfs_devread(data->vdev_phys_sector, 0, VDEV_PHYS_SIZE, *nvlist); + if (err) { + free(*nvlist); + *nvlist = 0; + return err; + } + return ZFS_ERR_NONE; +} + +/* + * Check the disk label information and retrieve needed vdev name-value pairs. 
+ * + */ +static int +check_pool_label(struct grub_zfs_data *data) +{ + uint64_t pool_state; + char *nvlist; /* for the pool */ + char *vdevnvlist; /* for the vdev */ + uint64_t diskguid; + uint64_t version; + int found; + int err; + + err = zfs_fetch_nvlist(data, &nvlist); + if (err) + return err; + + found = grub_zfs_nvlist_lookup_uint64(nvlist, ZPOOL_CONFIG_POOL_STATE, + &pool_state); + if (!found) { + free(nvlist); + printf("zfs pool state not found\n"); + return ZFS_ERR_BAD_FS; + } + + if (pool_state == POOL_STATE_DESTROYED) { + free(nvlist); + printf("zpool is marked as destroyed\n"); + return ZFS_ERR_BAD_FS; + } + + data->label_txg = 0; + found = grub_zfs_nvlist_lookup_uint64(nvlist, ZPOOL_CONFIG_POOL_TXG, + &data->label_txg); + if (!found) { + free(nvlist); + printf("zfs pool txg not found\n"); + return ZFS_ERR_BAD_FS; + } + + /* not an active device */ + if (data->label_txg == 0) { + free(nvlist); + printf("zpool is not active\n"); + return ZFS_ERR_BAD_FS; + } + + found = grub_zfs_nvlist_lookup_uint64(nvlist, ZPOOL_CONFIG_VERSION, + &version); + if (!found) { + free(nvlist); + printf("zpool config version not found\n"); + return ZFS_ERR_BAD_FS; + } + + if (version > SPA_VERSION) { + free(nvlist); + printf("SPA version too new %llu > %llu\n", + (unsigned long long) version, + (unsigned long long) SPA_VERSION); + return ZFS_ERR_NOT_IMPLEMENTED_YET; + } + + vdevnvlist = grub_zfs_nvlist_lookup_nvlist(nvlist, ZPOOL_CONFIG_VDEV_TREE); + if (!vdevnvlist) { + free(nvlist); + printf("ZFS config vdev tree not found\n"); + return ZFS_ERR_BAD_FS; + } + + found = grub_zfs_nvlist_lookup_uint64(vdevnvlist, ZPOOL_CONFIG_ASHIFT, + &data->vdev_ashift); + free(vdevnvlist); + if (!found) { + free(nvlist); + printf("ZPOOL config ashift not found\n"); + return ZFS_ERR_BAD_FS; + } + + found = grub_zfs_nvlist_lookup_uint64(nvlist, ZPOOL_CONFIG_GUID, &diskguid); + if (!found) { + free(nvlist); + printf("ZPOOL config guid not found\n"); + return ZFS_ERR_BAD_FS; + } + + found = 
grub_zfs_nvlist_lookup_uint64(nvlist, ZPOOL_CONFIG_POOL_GUID, &data->pool_guid); + if (!found) { + free(nvlist); + printf("ZPOOL config pool guid not found\n"); + return ZFS_ERR_BAD_FS; + } + + free(nvlist); + + printf("ZFS Pool GUID: %llu (%016llx) Label: GUID: %llu (%016llx), txg: %llu, SPA v%llu, ashift: %llu\n", + (unsigned long long) data->pool_guid, + (unsigned long long) data->pool_guid, + (unsigned long long) diskguid, + (unsigned long long) diskguid, + (unsigned long long) data->label_txg, + (unsigned long long) version, + (unsigned long long) data->vdev_ashift); + + return ZFS_ERR_NONE; +} + +/* + * vdev_label_start returns the physical disk offset (in bytes) of + * label "l". + */ +static uint64_t vdev_label_start(uint64_t psize, int l) +{ + return (l * sizeof(vdev_label_t) + (l < VDEV_LABELS / 2 ? + 0 : psize - + VDEV_LABELS * sizeof(vdev_label_t))); +} + +void +zfs_unmount(struct grub_zfs_data *data) +{ + free(data->dnode_buf); + free(data->dnode_mdn); + free(data->file_buf); + free(data); +} + +/* + * zfs_mount() locates a valid uberblock of the root pool and reads its MOS + * into the memory address MOS. + * + */ +struct grub_zfs_data * +zfs_mount(device_t dev) +{ + struct grub_zfs_data *data = 0; + int label = 0, bestlabel = -1; + char *ub_array; + uberblock_t *ubbest; + uberblock_t *ubcur = NULL; + void *osp = 0; + size_t ospsize; + int err; + + data = malloc(sizeof(*data)); + if (!data) + return 0; + memset(data, 0, sizeof(*data)); + + ub_array = malloc(VDEV_UBERBLOCK_RING); + if (!ub_array) { + zfs_unmount(data); + return 0; + } + + ubbest = malloc(sizeof(*ubbest)); + if (!ubbest) { + free(ub_array); + zfs_unmount(data); + return 0; + } + memset(ubbest, 0, sizeof(*ubbest)); + + /* + * Some El Torito stacks don't give us a size, so + * we end up setting the size to MAXUINT; furthermore, + * some of these devices stop working once a single + * read past the end has been issued. 
Checking + * for a maximum part_length and skipping the backup + * labels at the end of the slice/partition/device + * avoids breaking down on such devices. + */ + const int vdevnum = + dev->part_length == 0 ? + VDEV_LABELS / 2 : VDEV_LABELS; + + /* Size in bytes of the device (disk or partition) aligned to label size*/ + uint64_t device_size = + dev->part_length << SECTOR_BITS; + + const uint64_t alignedbytes = + P2ALIGN(device_size, (uint64_t) sizeof(vdev_label_t)); + + for (label = 0; label < vdevnum; label++) { + uint64_t labelstartbytes = vdev_label_start(alignedbytes, label); + uint64_t labelstart = labelstartbytes >> SECTOR_BITS; + + debug("zfs reading label %d at sector %llu (byte %llu)\n", + label, (unsigned long long) labelstart, + (unsigned long long) labelstartbytes); + + data->vdev_phys_sector = labelstart + + ((VDEV_SKIP_SIZE + VDEV_BOOT_HEADER_SIZE) >> SECTOR_BITS); + + err = check_pool_label(data); + if (err) { + printf("zfs error checking label %d\n", label); + continue; + } + + /* Read in the uberblock ring (128K). */ + err = zfs_devread(data->vdev_phys_sector + + (VDEV_PHYS_SIZE >> SECTOR_BITS), + 0, VDEV_UBERBLOCK_RING, ub_array); + if (err) { + printf("zfs error reading uberblock ring for label %d\n", label); + continue; + } + + ubcur = find_bestub(ub_array, data); + if (!ubcur) { + printf("zfs No good uberblocks found in label %d\n", label); + continue; + } + + if (vdev_uberblock_compare(ubcur, ubbest) > 0) { + /* Looks like the block is good, so use it.*/ + memcpy(ubbest, ubcur, sizeof(*ubbest)); + bestlabel = label; + debug("zfs Current best uberblock found in label %d\n", label); + } + } + free(ub_array); + + /* We zero'd the structure to begin with. If we never assigned to it, + magic will still be zero. 
*/ + if (!ubbest->ub_magic) { + printf("couldn't find a valid ZFS label\n"); + zfs_unmount(data); + free(ubbest); + return 0; + } + + debug("zfs ubbest %p in label %d\n", ubbest, bestlabel); + + grub_zfs_endian_t ub_endian = + grub_zfs_to_cpu64(ubbest->ub_magic, LITTLE_ENDIAN) == UBERBLOCK_MAGIC + ? LITTLE_ENDIAN : BIG_ENDIAN; + + debug("zfs endian set to %s\n", !ub_endian ? "big" : "little"); + + err = zio_read(&ubbest->ub_rootbp, ub_endian, &osp, &ospsize, data); + + if (err) { + printf("couldn't zio_read object directory\n"); + zfs_unmount(data); + free(ubbest); + return 0; + } + + if (ospsize < OBJSET_PHYS_SIZE_V14) { + printf("osp too small\n"); + zfs_unmount(data); + free(osp); + free(ubbest); + return 0; + } + + /* Got the MOS. Save it at the memory addr MOS. */ + memmove(&(data->mos.dn), &((objset_phys_t *) osp)->os_meta_dnode, DNODE_SIZE); + data->mos.endian = + (grub_zfs_to_cpu64(ubbest->ub_rootbp.blk_prop, ub_endian) >> 63) & 1; + memmove(&(data->current_uberblock), ubbest, sizeof(uberblock_t)); + + free(osp); + free(ubbest); + + return data; +} + +int +grub_zfs_fetch_nvlist(device_t dev, char **nvlist) +{ + struct grub_zfs_data *zfs; + int err; + + zfs = zfs_mount(dev); + if (!zfs) + return ZFS_ERR_BAD_FS; + err = zfs_fetch_nvlist(zfs, nvlist); + zfs_unmount(zfs); + return err; +} + +static int +zfs_label(device_t device, char **label) +{ + char *nvlist; + int err; + struct grub_zfs_data *data; + + data = zfs_mount(device); + if (!data) + return ZFS_ERR_BAD_FS; + + err = zfs_fetch_nvlist(data, &nvlist); + if (err) { + zfs_unmount(data); + return err; + } + + *label = grub_zfs_nvlist_lookup_string(nvlist, ZPOOL_CONFIG_POOL_NAME); + free(nvlist); + zfs_unmount(data); + return ZFS_ERR_NONE; +} + +static int +zfs_uuid(device_t device, char **uuid) +{ + struct grub_zfs_data *data; + + data = zfs_mount(device); + if (!data) + return ZFS_ERR_BAD_FS; + + *uuid = malloc(17); /* %016llx + NUL */ + if (!*uuid) { + zfs_unmount(data); + return ZFS_ERR_OUT_OF_MEMORY; + } + + /* *uuid = 
xasprintf ("%016llx", (long long unsigned) data->pool_guid);*/ + snprintf(*uuid, 17, "%016llx", (long long unsigned) data->pool_guid); + zfs_unmount(data); + + return ZFS_ERR_NONE; +} + +/* + * zfs_open() locates a file in the root pool by following the + * MOS and places the dnode of the file in the memory address DNODE. + */ +int +zfs_open(struct zfs_file *file, const char *fsfilename) +{ + struct grub_zfs_data *data; + int err; + int isfs; + + data = zfs_mount(file->device); + if (!data) + return ZFS_ERR_BAD_FS; + + err = dnode_get_fullpath(fsfilename, &(data->mdn), 0, + &(data->dnode), &isfs, data); + if (err) { + zfs_unmount(data); + return err; + } + + if (isfs) { + zfs_unmount(data); + printf("Missing @ or / separator\n"); + return ZFS_ERR_FILE_NOT_FOUND; + } + + /* We found the dnode for this file. Verify that it is a plain file. */ + if (data->dnode.dn.dn_type != DMU_OT_PLAIN_FILE_CONTENTS) { + zfs_unmount(data); + printf("not a file\n"); + return ZFS_ERR_BAD_FILE_TYPE; + } + + /* get the file size and set the file position to 0 */ + + /* + * For DMU_OT_SA we will need to locate the SIZE + * attribute, which could be either in the bonus buffer + * or the "spill" block. 
+ */ + if (data->dnode.dn.dn_bonustype == DMU_OT_SA) { + void *sahdrp; + int hdrsize; + + if (data->dnode.dn.dn_bonuslen != 0) { + sahdrp = (sa_hdr_phys_t *) DN_BONUS(&data->dnode.dn); + } else if (data->dnode.dn.dn_flags & DNODE_FLAG_SPILL_BLKPTR) { + blkptr_t *bp = &data->dnode.dn.dn_spill; + + err = zio_read(bp, data->dnode.endian, &sahdrp, NULL, data); + if (err) { + zfs_unmount(data); + return err; + } + } else { + printf("filesystem is corrupt :(\n"); + zfs_unmount(data); + return ZFS_ERR_BAD_FS; + } + + hdrsize = SA_HDR_SIZE(((sa_hdr_phys_t *) sahdrp)); + /* the size is stored in on-disk byte order, like zp_size below */ + file->size = grub_zfs_to_cpu64(*(uint64_t *) ((char *) sahdrp + hdrsize + SA_SIZE_OFFSET), data->dnode.endian); + } else { + file->size = grub_zfs_to_cpu64(((znode_phys_t *) DN_BONUS(&data->dnode.dn))->zp_size, data->dnode.endian); + } + + file->data = data; + file->offset = 0; + + return ZFS_ERR_NONE; +} + +uint64_t +zfs_read(zfs_file_t file, char *buf, uint64_t len) +{ + struct grub_zfs_data *data = (struct grub_zfs_data *) file->data; + int blksz, movesize; + uint64_t length; + int64_t red; + int err; + + if (data->file_buf == NULL) { + data->file_buf = malloc(SPA_MAXBLOCKSIZE); + if (!data->file_buf) + return -1; + data->file_start = data->file_end = 0; + } + + /* + * If offset is in memory, move it into the buffer provided and return. + */ + if (file->offset >= data->file_start + && file->offset + len <= data->file_end) { + memmove(buf, data->file_buf + file->offset - data->file_start, + len); + return len; + } + + blksz = grub_zfs_to_cpu16(data->dnode.dn.dn_datablkszsec, + data->dnode.endian) << SPA_MINBLOCKSHIFT; + + /* + * Entire Dnode is too big to fit into the space available. We + * will need to read it in chunks. This could be optimized to + * read in as large a chunk as there is space available, but for + * now, this only reads in one data block at a time. + */ + length = len; + red = 0; + while (length) { + void *t; + /* + * Find requested blkid and the offset within that block. 
+ */ + uint64_t blkid = (file->offset + red) / blksz; + free(data->file_buf); + data->file_buf = 0; + + err = dmu_read(&(data->dnode), blkid, &t, + 0, data); + data->file_buf = t; + if (err) + return -1; + + data->file_start = blkid * blksz; + data->file_end = data->file_start + blksz; + + movesize = MIN(length, data->file_end - (int) file->offset - red); + + memmove(buf, data->file_buf + file->offset + red + - data->file_start, movesize); + buf += movesize; + length -= movesize; + red += movesize; + } + + return len; +} + +int +zfs_close(zfs_file_t file) +{ + zfs_unmount((struct grub_zfs_data *) file->data); + return ZFS_ERR_NONE; +} + +int +grub_zfs_getmdnobj(device_t dev, const char *fsfilename, + uint64_t *mdnobj) +{ + struct grub_zfs_data *data; + int err; + int isfs; + + data = zfs_mount(dev); + if (!data) + return ZFS_ERR_BAD_FS; + + err = dnode_get_fullpath(fsfilename, &(data->mdn), mdnobj, + &(data->dnode), &isfs, data); + zfs_unmount(data); + return err; +} + +static void +fill_fs_info(struct zfs_dirhook_info *info, + dnode_end_t mdn, struct grub_zfs_data *data) +{ + int err; + dnode_end_t dn; + uint64_t objnum; + uint64_t headobj; + + memset(info, 0, sizeof(*info)); + + info->dir = 1; + + if (mdn.dn.dn_type == DMU_OT_DSL_DIR) { + headobj = grub_zfs_to_cpu64(((dsl_dir_phys_t *) DN_BONUS(&mdn.dn))->dd_head_dataset_obj, mdn.endian); + + err = dnode_get(&(data->mos), headobj, DMU_OT_DSL_DATASET, &mdn, data); + if (err) { + printf("zfs failed here 1\n"); + return; + } + } + make_mdn(&mdn, data); + err = dnode_get(&mdn, MASTER_NODE_OBJ, DMU_OT_MASTER_NODE, + &dn, data); + if (err) { + printf("zfs failed here 2\n"); + return; + } + + err = zap_lookup(&dn, ZFS_ROOT_OBJ, &objnum, data); + if (err) { + printf("zfs failed here 3\n"); + return; + } + + err = dnode_get(&mdn, objnum, 0, &dn, data); + if (err) { + printf("zfs failed here 4\n"); + return; + } + + info->mtimeset = 1; + info->mtime = grub_zfs_to_cpu64(((znode_phys_t *) DN_BONUS(&dn.dn))->zp_mtime[0], 
dn.endian); + + return; +} + +static int iterate_zap(const char *name, uint64_t val, struct grub_zfs_data *data) +{ + struct zfs_dirhook_info info; + dnode_end_t dn; + + memset(&info, 0, sizeof(info)); + + /* skip entries whose dnode cannot be read */ + if (dnode_get(&(data->mdn), val, 0, &dn, data)) + return 0; + info.mtimeset = 1; + info.mtime = grub_zfs_to_cpu64(((znode_phys_t *) DN_BONUS(&dn.dn))->zp_mtime[0], dn.endian); + info.dir = (dn.dn.dn_type == DMU_OT_DIRECTORY_CONTENTS); + debug("zfs type=%d, name=%s\n", + (int)dn.dn.dn_type, (char *)name); + if (!data->userhook) + return 0; + return data->userhook(name, &info); +} + +static int iterate_zap_fs(const char *name, uint64_t val, struct grub_zfs_data *data) +{ + struct zfs_dirhook_info info; + dnode_end_t mdn; + int err; + err = dnode_get(&(data->mos), val, 0, &mdn, data); + if (err) + return 0; + if (mdn.dn.dn_type != DMU_OT_DSL_DIR) + return 0; + + fill_fs_info(&info, mdn, data); + + if (!data->userhook) + return 0; + return data->userhook(name, &info); +} + +static int iterate_zap_snap(const char *name, uint64_t val, struct grub_zfs_data *data) +{ + struct zfs_dirhook_info info; + char *name2; + int ret = 0; + dnode_end_t mdn; + int err; + + err = dnode_get(&(data->mos), val, 0, &mdn, data); + if (err) + return 0; + + if (mdn.dn.dn_type != DMU_OT_DSL_DATASET) + return 0; + + fill_fs_info(&info, mdn, data); + + name2 = malloc(strlen(name) + 2); + if (!name2) + return 0; + name2[0] = '@'; + memcpy(name2 + 1, name, strlen(name) + 1); + if (data->userhook) + ret = data->userhook(name2, &info); + free(name2); + return ret; +} + +int +zfs_ls(device_t device, const char *path, + int (*hook)(const char *, const struct zfs_dirhook_info *)) +{ + struct grub_zfs_data *data; + int err; + int isfs; +#if 0 + char *label = NULL; + + zfs_label(device, &label); + if (label) + printf("ZPOOL label '%s'\n", + label); +#endif + + data = zfs_mount(device); + if (!data) + return ZFS_ERR_BAD_FS; + + data->userhook = hook; + + err = dnode_get_fullpath(path, &(data->mdn), 0, &(data->dnode), &isfs, data); + if (err) { 
+ zfs_unmount(data); + return err; + } + if (isfs) { + uint64_t childobj, headobj; + uint64_t snapobj; + dnode_end_t dn; + struct zfs_dirhook_info info; + + fill_fs_info(&info, data->dnode, data); + hook("@", &info); + + childobj = grub_zfs_to_cpu64(((dsl_dir_phys_t *) DN_BONUS(&data->dnode.dn))->dd_child_dir_zapobj, data->dnode.endian); + headobj = grub_zfs_to_cpu64(((dsl_dir_phys_t *) DN_BONUS(&data->dnode.dn))->dd_head_dataset_obj, data->dnode.endian); + err = dnode_get(&(data->mos), childobj, + DMU_OT_DSL_DIR_CHILD_MAP, &dn, data); + if (err) { + zfs_unmount(data); + return err; + } + + + zap_iterate(&dn, iterate_zap_fs, data); + + err = dnode_get(&(data->mos), headobj, DMU_OT_DSL_DATASET, &dn, data); + if (err) { + zfs_unmount(data); + return err; + } + + snapobj = grub_zfs_to_cpu64(((dsl_dataset_phys_t *) DN_BONUS(&dn.dn))->ds_snapnames_zapobj, dn.endian); + + err = dnode_get(&(data->mos), snapobj, + DMU_OT_DSL_DS_SNAP_MAP, &dn, data); + if (err) { + zfs_unmount(data); + return err; + } + + zap_iterate(&dn, iterate_zap_snap, data); + } else { + if (data->dnode.dn.dn_type != DMU_OT_DIRECTORY_CONTENTS) { + zfs_unmount(data); + printf("not a directory\n"); + return ZFS_ERR_BAD_FILE_TYPE; + } + zap_iterate(&(data->dnode), iterate_zap, data); + } + zfs_unmount(data); + return ZFS_ERR_NONE; +} + diff --git a/fs/zfs/zfs_fletcher.c b/fs/zfs/zfs_fletcher.c new file mode 100644 index 0000000..d96c6ff --- /dev/null +++ b/fs/zfs/zfs_fletcher.c @@ -0,0 +1,84 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004,2009 + * Free Software Foundation, Inc. + * Copyright 2007 Sun Microsystems, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. 
+ * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ + +#include <common.h> +#include <malloc.h> +#include <linux/stat.h> +#include <linux/time.h> +#include <linux/ctype.h> +#include <asm/byteorder.h> +#include "zfs_common.h" + +#include <zfs/zfs.h> +#include <zfs/zio.h> +#include <zfs/dnode.h> +#include <zfs/uberblock_impl.h> +#include <zfs/vdev_impl.h> +#include <zfs/zio_checksum.h> +#include <zfs/zap_impl.h> +#include <zfs/zap_leaf.h> +#include <zfs/zfs_znode.h> +#include <zfs/dmu.h> +#include <zfs/dmu_objset.h> +#include <zfs/dsl_dir.h> +#include <zfs/dsl_dataset.h> + +void +fletcher_2(const void *buf, uint64_t size, grub_zfs_endian_t endian, + zio_cksum_t *zcp) +{ + const uint64_t *ip = buf; + const uint64_t *ipend = ip + (size / sizeof(uint64_t)); + uint64_t a0, b0, a1, b1; + + for (a0 = b0 = a1 = b1 = 0; ip < ipend; ip += 2) { + a0 += grub_zfs_to_cpu64(ip[0], endian); + a1 += grub_zfs_to_cpu64(ip[1], endian); + b0 += a0; + b1 += a1; + } + + zcp->zc_word[0] = grub_cpu_to_zfs64(a0, endian); + zcp->zc_word[1] = grub_cpu_to_zfs64(a1, endian); + zcp->zc_word[2] = grub_cpu_to_zfs64(b0, endian); + zcp->zc_word[3] = grub_cpu_to_zfs64(b1, endian); +} + +void +fletcher_4(const void *buf, uint64_t size, grub_zfs_endian_t endian, + zio_cksum_t *zcp) +{ + const uint32_t *ip = buf; + const uint32_t *ipend = ip + (size / sizeof(uint32_t)); + uint64_t a, b, c, d; + + for (a = b = c = d = 0; ip < ipend; ip++) { + a += grub_zfs_to_cpu32(ip[0], endian); + b += a; + c += b; + d += c; + } + + zcp->zc_word[0] = grub_cpu_to_zfs64(a, endian); + zcp->zc_word[1] = grub_cpu_to_zfs64(b, endian); + zcp->zc_word[2] = grub_cpu_to_zfs64(c, endian); + 
zcp->zc_word[3] = grub_cpu_to_zfs64(d, endian); +} + diff --git a/fs/zfs/zfs_lzjb.c b/fs/zfs/zfs_lzjb.c new file mode 100644 index 0000000..33e9b90 --- /dev/null +++ b/fs/zfs/zfs_lzjb.c @@ -0,0 +1,94 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004,2009 + * Free Software Foundation, Inc. + * Copyright 2007 Sun Microsystems, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. 
+ */ + +#include <common.h> +#include <malloc.h> +#include <linux/stat.h> +#include <linux/time.h> +#include <linux/ctype.h> +#include <asm/byteorder.h> +#include "zfs_common.h" + +#include <zfs/zfs.h> +#include <zfs/zio.h> +#include <zfs/dnode.h> +#include <zfs/uberblock_impl.h> +#include <zfs/vdev_impl.h> +#include <zfs/zio_checksum.h> +#include <zfs/zap_impl.h> +#include <zfs/zap_leaf.h> +#include <zfs/zfs_znode.h> +#include <zfs/dmu.h> +#include <zfs/dmu_objset.h> +#include <zfs/dsl_dir.h> +#include <zfs/dsl_dataset.h> + +#define MATCH_BITS 6 +#define MATCH_MIN 3 +#define OFFSET_MASK ((1 << (16 - MATCH_BITS)) - 1) + +/* + * Decompression Entry - lzjb + */ +#ifndef NBBY +#define NBBY 8 +#endif + +int +lzjb_decompress(void *s_start, void *d_start, uint32_t s_len, + uint32_t d_len) +{ + uint8_t *src = s_start; + uint8_t *dst = d_start; + uint8_t *d_end = (uint8_t *) d_start + d_len; + uint8_t *s_end = (uint8_t *) s_start + s_len; + uint8_t *cpy, copymap = 0; + int copymask = 1 << (NBBY - 1); + + while (dst < d_end && src < s_end) { + if ((copymask <<= 1) == (1 << NBBY)) { + copymask = 1; + copymap = *src++; + } + if (src >= s_end) { + printf("lzjb decompression failed\n"); + return ZFS_ERR_BAD_FS; + } + if (copymap & copymask) { + int mlen = (src[0] >> (NBBY - MATCH_BITS)) + MATCH_MIN; + int offset = ((src[0] << NBBY) | src[1]) & OFFSET_MASK; + src += 2; + cpy = dst - offset; + if (src > s_end || cpy < (uint8_t *) d_start) { + printf("lzjb decompression failed\n"); + return ZFS_ERR_BAD_FS; + } + while (--mlen >= 0 && dst < d_end) + *dst++ = *cpy++; + } else { + *dst++ = *src++; + } + } + if (dst < d_end) { + printf("lzjb decompression failed\n"); + return ZFS_ERR_BAD_FS; + } + return ZFS_ERR_NONE; +} diff --git a/fs/zfs/zfs_sha256.c b/fs/zfs/zfs_sha256.c new file mode 100644 index 0000000..7a9439a --- /dev/null +++ b/fs/zfs/zfs_sha256.c @@ -0,0 +1,145 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004,2009 + * Free Software 
Foundation, Inc. + * Copyright 2007 Sun Microsystems, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ + +#include <common.h> +#include <malloc.h> +#include <linux/stat.h> +#include <linux/time.h> +#include <linux/ctype.h> +#include <asm/byteorder.h> +#include "zfs_common.h" + +#include <zfs/zfs.h> +#include <zfs/zio.h> +#include <zfs/dnode.h> +#include <zfs/uberblock_impl.h> +#include <zfs/vdev_impl.h> +#include <zfs/zio_checksum.h> +#include <zfs/zap_impl.h> +#include <zfs/zap_leaf.h> +#include <zfs/zfs_znode.h> +#include <zfs/dmu.h> +#include <zfs/dmu_objset.h> +#include <zfs/dsl_dir.h> +#include <zfs/dsl_dataset.h> + +/* + * SHA-256 checksum, as specified in FIPS 180-2, available at: + * http://csrc.nist.gov/cryptval + * + * This is a very compact implementation of SHA-256. + * It is designed to be simple and portable, not to be fast. + */ + +/* + * The literal definitions according to FIPS180-2 would be: + * + * Ch(x, y, z) (((x) & (y)) ^ ((~(x)) & (z))) + * Maj(x, y, z) (((x) & (y)) | ((x) & (z)) | ((y) & (z))) + * + * We use logical equivalents which require one less op. 
+ */ +#define Ch(x, y, z) ((z) ^ ((x) & ((y) ^ (z)))) +#define Maj(x, y, z) (((x) & (y)) ^ ((z) & ((x) ^ (y)))) +#define Rot32(x, s) (((x) >> s) | ((x) << (32 - s))) +#define SIGMA0(x) (Rot32(x, 2) ^ Rot32(x, 13) ^ Rot32(x, 22)) +#define SIGMA1(x) (Rot32(x, 6) ^ Rot32(x, 11) ^ Rot32(x, 25)) +#define sigma0(x) (Rot32(x, 7) ^ Rot32(x, 18) ^ ((x) >> 3)) +#define sigma1(x) (Rot32(x, 17) ^ Rot32(x, 19) ^ ((x) >> 10)) + +static const uint32_t SHA256_K[64] = { + 0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5, + 0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5, + 0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3, + 0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174, + 0xe49b69c1, 0xefbe4786, 0x0fc19dc6, 0x240ca1cc, + 0x2de92c6f, 0x4a7484aa, 0x5cb0a9dc, 0x76f988da, + 0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7, + 0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967, + 0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, 0x53380d13, + 0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85, + 0xa2bfe8a1, 0xa81a664b, 0xc24b8b70, 0xc76c51a3, + 0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070, + 0x19a4c116, 0x1e376c08, 0x2748774c, 0x34b0bcb5, + 0x391c0cb3, 0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3, + 0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208, + 0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2 +}; + +static void +SHA256Transform(uint32_t *H, const uint8_t *cp) +{ + uint32_t a, b, c, d, e, f, g, h, t, T1, T2, W[64]; + + for (t = 0; t < 16; t++, cp += 4) + W[t] = (cp[0] << 24) | (cp[1] << 16) | (cp[2] << 8) | cp[3]; + + for (t = 16; t < 64; t++) + W[t] = sigma1(W[t - 2]) + W[t - 7] + + sigma0(W[t - 15]) + W[t - 16]; + + a = H[0]; b = H[1]; c = H[2]; d = H[3]; + e = H[4]; f = H[5]; g = H[6]; h = H[7]; + + for (t = 0; t < 64; t++) { + T1 = h + SIGMA1(e) + Ch(e, f, g) + SHA256_K[t] + W[t]; + T2 = SIGMA0(a) + Maj(a, b, c); + h = g; g = f; f = e; e = d + T1; + d = c; c = b; b = a; a = T1 + T2; + } + + H[0] += a; H[1] += b; H[2] += c; H[3] += d; + H[4] += e; H[5] += f; H[6] += g; H[7] += h; +} + +void +zio_checksum_SHA256(const 
void *buf, uint64_t size, + grub_zfs_endian_t endian, zio_cksum_t *zcp) +{ + uint32_t H[8] = { 0x6a09e667, 0xbb67ae85, 0x3c6ef372, 0xa54ff53a, + 0x510e527f, 0x9b05688c, 0x1f83d9ab, 0x5be0cd19 }; + uint8_t pad[128]; + unsigned padsize = size & 63; + unsigned i; + + for (i = 0; i < size - padsize; i += 64) + SHA256Transform(H, (uint8_t *)buf + i); + + for (i = 0; i < padsize; i++) + pad[i] = ((uint8_t *)buf)[size - padsize + i]; + + for (pad[padsize++] = 0x80; (padsize & 63) != 56; padsize++) + pad[padsize] = 0; + + for (i = 0; i < 8; i++) + pad[padsize++] = (size << 3) >> (56 - 8 * i); + + for (i = 0; i < padsize; i += 64) + SHA256Transform(H, pad + i); + + zcp->zc_word[0] = grub_cpu_to_zfs64((uint64_t)H[0] << 32 | H[1], + endian); + zcp->zc_word[1] = grub_cpu_to_zfs64((uint64_t)H[2] << 32 | H[3], + endian); + zcp->zc_word[2] = grub_cpu_to_zfs64((uint64_t)H[4] << 32 | H[5], + endian); + zcp->zc_word[3] = grub_cpu_to_zfs64((uint64_t)H[6] << 32 | H[7], + endian); +} diff --git a/include/config_cmd_all.h b/include/config_cmd_all.h index 55f4f7a..5933ae9 100644 --- a/include/config_cmd_all.h +++ b/include/config_cmd_all.h @@ -36,6 +36,7 @@ #define CONFIG_CMD_ELF /* ELF (VxWorks) load/boot cmd */ #define CONFIG_CMD_EXT2 /* EXT2 Support */ #define CONFIG_CMD_FAT /* FAT support */ +#define CONFIG_CMD_ZFS /* ZFS support */ #define CONFIG_CMD_FDC /* Floppy Disk Support */ #define CONFIG_CMD_FDOS /* Floppy DOS support */ #define CONFIG_CMD_FLASH /* flinfo, erase, protect */ diff --git a/include/zfs/dmu.h b/include/zfs/dmu.h new file mode 100644 index 0000000..bee317e --- /dev/null +++ b/include/zfs/dmu.h @@ -0,0 +1,119 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. 
+ * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2010 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. + */ + +#ifndef _SYS_DMU_H +#define _SYS_DMU_H + +/* + * This file describes the interface that the DMU provides for its + * consumers. + * + * The DMU also interacts with the SPA. That interface is described in + * dmu_spa.h. + */ +typedef enum dmu_object_type { + DMU_OT_NONE, + /* general: */ + DMU_OT_OBJECT_DIRECTORY, /* ZAP */ + DMU_OT_OBJECT_ARRAY, /* UINT64 */ + DMU_OT_PACKED_NVLIST, /* UINT8 (XDR by nvlist_pack/unpack) */ + DMU_OT_PACKED_NVLIST_SIZE, /* UINT64 */ + DMU_OT_BPLIST, /* UINT64 */ + DMU_OT_BPLIST_HDR, /* UINT64 */ + /* spa: */ + DMU_OT_SPACE_MAP_HEADER, /* UINT64 */ + DMU_OT_SPACE_MAP, /* UINT64 */ + /* zil: */ + DMU_OT_INTENT_LOG, /* UINT64 */ + /* dmu: */ + DMU_OT_DNODE, /* DNODE */ + DMU_OT_OBJSET, /* OBJSET */ + /* dsl: */ + DMU_OT_DSL_DIR, /* UINT64 */ + DMU_OT_DSL_DIR_CHILD_MAP, /* ZAP */ + DMU_OT_DSL_DS_SNAP_MAP, /* ZAP */ + DMU_OT_DSL_PROPS, /* ZAP */ + DMU_OT_DSL_DATASET, /* UINT64 */ + /* zpl: */ + DMU_OT_ZNODE, /* ZNODE */ + DMU_OT_OLDACL, /* OLD ACL */ + DMU_OT_PLAIN_FILE_CONTENTS, /* UINT8 */ + DMU_OT_DIRECTORY_CONTENTS, /* ZAP */ + DMU_OT_MASTER_NODE, /* ZAP */ + DMU_OT_UNLINKED_SET, /* ZAP */ + /* zvol: */ + DMU_OT_ZVOL, /* UINT8 */ + DMU_OT_ZVOL_PROP, /* ZAP */ + /* other; for testing only! 
*/ + DMU_OT_PLAIN_OTHER, /* UINT8 */ + DMU_OT_UINT64_OTHER, /* UINT64 */ + DMU_OT_ZAP_OTHER, /* ZAP */ + /* new object types: */ + DMU_OT_ERROR_LOG, /* ZAP */ + DMU_OT_SPA_HISTORY, /* UINT8 */ + DMU_OT_SPA_HISTORY_OFFSETS, /* spa_his_phys_t */ + DMU_OT_POOL_PROPS, /* ZAP */ + DMU_OT_DSL_PERMS, /* ZAP */ + DMU_OT_ACL, /* ACL */ + DMU_OT_SYSACL, /* SYSACL */ + DMU_OT_FUID, /* FUID table (Packed NVLIST UINT8) */ + DMU_OT_FUID_SIZE, /* FUID table size UINT64 */ + DMU_OT_NEXT_CLONES, /* ZAP */ + DMU_OT_SCRUB_QUEUE, /* ZAP */ + DMU_OT_USERGROUP_USED, /* ZAP */ + DMU_OT_USERGROUP_QUOTA, /* ZAP */ + DMU_OT_USERREFS, /* ZAP */ + DMU_OT_DDT_ZAP, /* ZAP */ + DMU_OT_DDT_STATS, /* ZAP */ + DMU_OT_SA, /* System attr */ + DMU_OT_SA_MASTER_NODE, /* ZAP */ + DMU_OT_SA_ATTR_REGISTRATION, /* ZAP */ + DMU_OT_SA_ATTR_LAYOUTS, /* ZAP */ + DMU_OT_NUMTYPES +} dmu_object_type_t; + +typedef enum dmu_objset_type { + DMU_OST_NONE, + DMU_OST_META, + DMU_OST_ZFS, + DMU_OST_ZVOL, + DMU_OST_OTHER, /* For testing only! */ + DMU_OST_ANY, /* Be careful! */ + DMU_OST_NUMTYPES +} dmu_objset_type_t; + +/* + * The names of zap entries in the DIRECTORY_OBJECT of the MOS. + */ +#define DMU_POOL_DIRECTORY_OBJECT 1 +#define DMU_POOL_CONFIG "config" +#define DMU_POOL_ROOT_DATASET "root_dataset" +#define DMU_POOL_SYNC_BPLIST "sync_bplist" +#define DMU_POOL_ERRLOG_SCRUB "errlog_scrub" +#define DMU_POOL_ERRLOG_LAST "errlog_last" +#define DMU_POOL_SPARES "spares" +#define DMU_POOL_DEFLATE "deflate" +#define DMU_POOL_HISTORY "history" +#define DMU_POOL_PROPS "pool_props" +#define DMU_POOL_L2CACHE "l2cache" + +#endif /* _SYS_DMU_H */ diff --git a/include/zfs/dmu_objset.h b/include/zfs/dmu_objset.h new file mode 100644 index 0000000..176cad7 --- /dev/null +++ b/include/zfs/dmu_objset.h @@ -0,0 +1,43 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. 
+ * Copyright (C) 2010 Robert Millan rmh@gnu.org + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2009 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. + */ + +#ifndef _SYS_DMU_OBJSET_H +#define _SYS_DMU_OBJSET_H + +#include <zfs/zil.h> + +#define OBJSET_PHYS_SIZE 2048 +#define OBJSET_PHYS_SIZE_V14 1024 + +typedef struct objset_phys { + dnode_phys_t os_meta_dnode; + zil_header_t os_zil_header; + uint64_t os_type; + uint64_t os_flags; + char os_pad[OBJSET_PHYS_SIZE - sizeof(dnode_phys_t)*3 - + sizeof(zil_header_t) - sizeof(uint64_t)*2]; + dnode_phys_t os_userused_dnode; + dnode_phys_t os_groupused_dnode; +} objset_phys_t; + +#endif /* _SYS_DMU_OBJSET_H */ diff --git a/include/zfs/dnode.h b/include/zfs/dnode.h new file mode 100644 index 0000000..9ec3d43 --- /dev/null +++ b/include/zfs/dnode.h @@ -0,0 +1,80 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. 
+ * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2010 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. + */ + +#ifndef _SYS_DNODE_H +#define _SYS_DNODE_H + +#include <zfs/spa.h> + +/* + * Fixed constants. + */ +#define DNODE_SHIFT 9 /* 512 bytes */ +#define DN_MIN_INDBLKSHIFT 10 /* 1k */ +#define DN_MAX_INDBLKSHIFT 14 /* 16k */ +#define DNODE_BLOCK_SHIFT 14 /* 16k */ +#define DNODE_CORE_SIZE 64 /* 64 bytes for dnode sans blkptrs */ +#define DN_MAX_OBJECT_SHIFT 48 /* 256 trillion (zfs_fid_t limit) */ +#define DN_MAX_OFFSET_SHIFT 64 /* 2^64 bytes in a dnode */ + +/* + * Derived constants. + */ +#define DNODE_SIZE (1 << DNODE_SHIFT) +#define DN_MAX_NBLKPTR ((DNODE_SIZE - DNODE_CORE_SIZE) >> SPA_BLKPTRSHIFT) +#define DN_MAX_BONUSLEN (DNODE_SIZE - DNODE_CORE_SIZE - (1 << SPA_BLKPTRSHIFT)) +#define DN_MAX_OBJECT (1ULL << DN_MAX_OBJECT_SHIFT) + +#define DNODES_PER_BLOCK_SHIFT (DNODE_BLOCK_SHIFT - DNODE_SHIFT) +#define DNODES_PER_BLOCK (1ULL << DNODES_PER_BLOCK_SHIFT) +#define DNODES_PER_LEVEL_SHIFT (DN_MAX_INDBLKSHIFT - SPA_BLKPTRSHIFT) + +#define DNODE_FLAG_SPILL_BLKPTR (1<<2) + +#define DN_BONUS(dnp) ((void *)((dnp)->dn_bonus + \ + (((dnp)->dn_nblkptr - 1) * sizeof(blkptr_t)))) + +typedef struct dnode_phys { + uint8_t dn_type; /* dmu_object_type_t */ + uint8_t dn_indblkshift; /* ln2(indirect block size) */ + uint8_t dn_nlevels; /* 1=dn_blkptr->data blocks */ + uint8_t dn_nblkptr; /* length of dn_blkptr */ + uint8_t dn_bonustype; /* type of data in bonus buffer */ + uint8_t dn_checksum; /* ZIO_CHECKSUM type */ + uint8_t dn_compress; /* ZIO_COMPRESS type */ + uint8_t dn_flags; /* 
DNODE_FLAG_* */ + uint16_t dn_datablkszsec; /* data block size in 512b sectors */ + uint16_t dn_bonuslen; /* length of dn_bonus */ + uint8_t dn_pad2[4]; + + /* accounting is protected by dn_dirty_mtx */ + uint64_t dn_maxblkid; /* largest allocated block ID */ + uint64_t dn_used; /* bytes (or sectors) of disk space */ + + uint64_t dn_pad3[4]; + + blkptr_t dn_blkptr[1]; + uint8_t dn_bonus[DN_MAX_BONUSLEN - sizeof(blkptr_t)]; + blkptr_t dn_spill; +} dnode_phys_t; + +#endif /* _SYS_DNODE_H */ diff --git a/include/zfs/dsl_dataset.h b/include/zfs/dsl_dataset.h new file mode 100644 index 0000000..c6de7ab --- /dev/null +++ b/include/zfs/dsl_dataset.h @@ -0,0 +1,52 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2007 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. 
+ */ + +#ifndef _SYS_DSL_DATASET_H +#define _SYS_DSL_DATASET_H + +typedef struct dsl_dataset_phys { + uint64_t ds_dir_obj; + uint64_t ds_prev_snap_obj; + uint64_t ds_prev_snap_txg; + uint64_t ds_next_snap_obj; + uint64_t ds_snapnames_zapobj; /* zap obj of snaps; ==0 for snaps */ + uint64_t ds_num_children; /* clone/snap children; ==0 for head */ + uint64_t ds_creation_time; /* seconds since 1970 */ + uint64_t ds_creation_txg; + uint64_t ds_deadlist_obj; + uint64_t ds_used_bytes; + uint64_t ds_compressed_bytes; + uint64_t ds_uncompressed_bytes; + uint64_t ds_unique_bytes; /* only relevant to snapshots */ + /* + * The ds_fsid_guid is a 56-bit ID that can change to avoid + * collisions. The ds_guid is a 64-bit ID that will never + * change, so there is a small probability that it will collide. + */ + uint64_t ds_fsid_guid; + uint64_t ds_guid; + uint64_t ds_flags; + blkptr_t ds_bp; + uint64_t ds_pad[8]; /* pad out to 320 bytes for good measure */ +} dsl_dataset_phys_t; + +#endif /* _SYS_DSL_DATASET_H */ diff --git a/include/zfs/dsl_dir.h b/include/zfs/dsl_dir.h new file mode 100644 index 0000000..c04e0b6 --- /dev/null +++ b/include/zfs/dsl_dir.h @@ -0,0 +1,48 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2007 Sun Microsystems, Inc. 
All rights reserved. + * Use is subject to license terms. + */ + +#ifndef _SYS_DSL_DIR_H +#define _SYS_DSL_DIR_H + +typedef struct dsl_dir_phys { + uint64_t dd_creation_time; /* not actually used */ + uint64_t dd_head_dataset_obj; + uint64_t dd_parent_obj; + uint64_t dd_clone_parent_obj; + uint64_t dd_child_dir_zapobj; + /* + * how much space our children are accounting for; for leaf + * datasets, == physical space used by fs + snaps + */ + uint64_t dd_used_bytes; + uint64_t dd_compressed_bytes; + uint64_t dd_uncompressed_bytes; + /* Administrative quota setting */ + uint64_t dd_quota; + /* Administrative reservation setting */ + uint64_t dd_reserved; + uint64_t dd_props_zapobj; + uint64_t dd_deleg_zapobj; /* dataset permissions */ + uint64_t dd_pad[20]; /* pad out to 256 bytes for good measure */ +} dsl_dir_phys_t; + +#endif /* _SYS_DSL_DIR_H */ diff --git a/include/zfs/sa_impl.h b/include/zfs/sa_impl.h new file mode 100644 index 0000000..4ec49fe --- /dev/null +++ b/include/zfs/sa_impl.h @@ -0,0 +1,34 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2010 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. 
+ */ +#ifndef _SYS_SA_IMPL_H +#define _SYS_SA_IMPL_H + +typedef struct sa_hdr_phys { + uint32_t sa_magic; + uint16_t sa_layout_info; + uint16_t sa_lengths[1]; +} sa_hdr_phys_t; + +#define SA_HDR_SIZE(hdr) BF32_GET_SB(hdr->sa_layout_info, 10, 16, 3, 0) +#define SA_SIZE_OFFSET 0x8 + +#endif /* _SYS_SA_IMPL_H */ diff --git a/include/zfs/spa.h b/include/zfs/spa.h new file mode 100644 index 0000000..100e2a6 --- /dev/null +++ b/include/zfs/spa.h @@ -0,0 +1,311 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004,2009 + * Free Software Foundation, Inc. + * Copyright 2010 Sun Microsystems, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ + +#ifndef GRUB_ZFS_SPA_HEADER +#define GRUB_ZFS_SPA_HEADER 1 + +typedef enum grub_zfs_endian { + UNKNOWN_ENDIAN = -2, + LITTLE_ENDIAN = -1, + BIG_ENDIAN = 0 +} grub_zfs_endian_t; + + +#define grub_zfs_to_cpu16(x, a) (((a) == BIG_ENDIAN) ? grub_be_to_cpu16(x) \ + : grub_le_to_cpu16(x)) +#define grub_cpu_to_zfs16(x, a) (((a) == BIG_ENDIAN) ? grub_cpu_to_be16(x) \ + : grub_cpu_to_le16(x)) + +#define grub_zfs_to_cpu32(x, a) (((a) == BIG_ENDIAN) ? grub_be_to_cpu32(x) \ + : grub_le_to_cpu32(x)) +#define grub_cpu_to_zfs32(x, a) (((a) == BIG_ENDIAN) ? grub_cpu_to_be32(x) \ + : grub_cpu_to_le32(x)) + +#define grub_zfs_to_cpu64(x, a) (((a) == BIG_ENDIAN) ? 
grub_be_to_cpu64(x) \ + : grub_le_to_cpu64(x)) +#define grub_cpu_to_zfs64(x, a) (((a) == BIG_ENDIAN) ? grub_cpu_to_be64(x) \ + : grub_cpu_to_le64(x)) + +/* + * General-purpose 32-bit and 64-bit bitfield encodings. + */ +#define BF32_DECODE(x, low, len) P2PHASE((x) >> (low), 1U << (len)) +#define BF64_DECODE(x, low, len) P2PHASE((x) >> (low), 1ULL << (len)) +#define BF32_ENCODE(x, low, len) (P2PHASE((x), 1U << (len)) << (low)) +#define BF64_ENCODE(x, low, len) (P2PHASE((x), 1ULL << (len)) << (low)) + +#define BF32_GET(x, low, len) BF32_DECODE(x, low, len) +#define BF64_GET(x, low, len) BF64_DECODE(x, low, len) + +#define BF32_SET(x, low, len, val) \ + ((x) ^= BF32_ENCODE((x >> low) ^ (val), low, len)) +#define BF64_SET(x, low, len, val) \ + ((x) ^= BF64_ENCODE((x >> low) ^ (val), low, len)) + +#define BF32_GET_SB(x, low, len, shift, bias) \ + ((BF32_GET(x, low, len) + (bias)) << (shift)) +#define BF64_GET_SB(x, low, len, shift, bias) \ + ((BF64_GET(x, low, len) + (bias)) << (shift)) + +#define BF32_SET_SB(x, low, len, shift, bias, val) \ + BF32_SET(x, low, len, ((val) >> (shift)) - (bias)) +#define BF64_SET_SB(x, low, len, shift, bias, val) \ + BF64_SET(x, low, len, ((val) >> (shift)) - (bias)) + +/* + * We currently support nine block sizes, from 512 bytes to 128K. + * We could go higher, but the benefits are near-zero and the cost + * of COWing a giant block to modify one byte would become excessive. + */ +#define SPA_MINBLOCKSHIFT 9 +#define SPA_MAXBLOCKSHIFT 17 +#define SPA_MINBLOCKSIZE (1ULL << SPA_MINBLOCKSHIFT) +#define SPA_MAXBLOCKSIZE (1ULL << SPA_MAXBLOCKSHIFT) + +#define SPA_BLOCKSIZES (SPA_MAXBLOCKSHIFT - SPA_MINBLOCKSHIFT + 1) + +/* + * Size of block to hold the configuration data (a packed nvlist) + */ +#define SPA_CONFIG_BLOCKSIZE (1 << 14) + +/* + * The DVA size encodings for LSIZE and PSIZE support blocks up to 32MB. 
+ * The ASIZE encoding should be at least 64 times larger (6 more bits) + * to support up to 4-way RAID-Z mirror mode with worst-case gang block + * overhead, three DVAs per bp, plus one more bit in case we do anything + * else that expands the ASIZE. + */ +#define SPA_LSIZEBITS 16 /* LSIZE up to 32M (2^16 * 512) */ +#define SPA_PSIZEBITS 16 /* PSIZE up to 32M (2^16 * 512) */ +#define SPA_ASIZEBITS 24 /* ASIZE up to 64 times larger */ + +/* + * All SPA data is represented by 128-bit data virtual addresses (DVAs). + * The members of the dva_t should be considered opaque outside the SPA. + */ +typedef struct dva { + uint64_t dva_word[2]; +} dva_t; + +/* + * Each block has a 256-bit checksum -- strong enough for cryptographic hashes. + */ +typedef struct zio_cksum { + uint64_t zc_word[4]; +} zio_cksum_t; + +/* + * Each block is described by its DVAs, time of birth, checksum, etc. + * The word-by-word, bit-by-bit layout of the blkptr is as follows: + * + * 64 56 48 40 32 24 16 8 0 + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * 0 | vdev1 | GRID | ASIZE | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * 1 |G| offset1 | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * 2 | vdev2 | GRID | ASIZE | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * 3 |G| offset2 | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * 4 | vdev3 | GRID | ASIZE | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * 5 |G| offset3 | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * 6 |BDX|lvl| type | cksum | comp | PSIZE | LSIZE | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * 7 | padding | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * 8 | padding | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * 9 | physical birth txg | + * 
+-------+-------+-------+-------+-------+-------+-------+-------+ + * a | logical birth txg | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * b | fill count | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * c | checksum[0] | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * d | checksum[1] | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * e | checksum[2] | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * f | checksum[3] | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * + * Legend: + * + * vdev virtual device ID + * offset offset into virtual device + * LSIZE logical size + * PSIZE physical size (after compression) + * ASIZE allocated size (including RAID-Z parity and gang block headers) + * GRID RAID-Z layout information (reserved for future use) + * cksum checksum function + * comp compression function + * G gang block indicator + * B byteorder (endianness) + * D dedup + * X unused + * lvl level of indirection + * type DMU object type + * phys birth txg of block allocation; zero if same as logical birth txg + * log. birth transaction group in which the block was logically born + * fill count number of non-zero blocks under this bp + * checksum[4] 256-bit checksum of the data this bp describes + */ +#define SPA_BLKPTRSHIFT 7 /* blkptr_t is 128 bytes */ +#define SPA_DVAS_PER_BP 3 /* Number of DVAs in a bp */ + +typedef struct blkptr { + dva_t blk_dva[SPA_DVAS_PER_BP]; /* Data Virtual Addresses */ + uint64_t blk_prop; /* size, compression, type, etc */ + uint64_t blk_pad[2]; /* Extra space for the future */ + uint64_t blk_phys_birth; /* txg when block was allocated */ + uint64_t blk_birth; /* transaction group at birth */ + uint64_t blk_fill; /* fill count */ + zio_cksum_t blk_cksum; /* 256-bit checksum */ +} blkptr_t; + +/* + * Macros to get and set fields in a bp or DVA. 
+ */ +#define DVA_GET_ASIZE(dva) \ + BF64_GET_SB((dva)->dva_word[0], 0, 24, SPA_MINBLOCKSHIFT, 0) +#define DVA_SET_ASIZE(dva, x) \ + BF64_SET_SB((dva)->dva_word[0], 0, 24, SPA_MINBLOCKSHIFT, 0, x) + +#define DVA_GET_GRID(dva) BF64_GET((dva)->dva_word[0], 24, 8) +#define DVA_SET_GRID(dva, x) BF64_SET((dva)->dva_word[0], 24, 8, x) + +#define DVA_GET_VDEV(dva) BF64_GET((dva)->dva_word[0], 32, 32) +#define DVA_SET_VDEV(dva, x) BF64_SET((dva)->dva_word[0], 32, 32, x) + +#define DVA_GET_GANG(dva) BF64_GET((dva)->dva_word[1], 63, 1) +#define DVA_SET_GANG(dva, x) BF64_SET((dva)->dva_word[1], 63, 1, x) + +#define BP_GET_LSIZE(bp) \ + BF64_GET_SB((bp)->blk_prop, 0, 16, SPA_MINBLOCKSHIFT, 1) +#define BP_SET_LSIZE(bp, x) \ + BF64_SET_SB((bp)->blk_prop, 0, 16, SPA_MINBLOCKSHIFT, 1, x) + +#define BP_GET_COMPRESS(bp) BF64_GET((bp)->blk_prop, 32, 8) +#define BP_SET_COMPRESS(bp, x) BF64_SET((bp)->blk_prop, 32, 8, x) + +#define BP_GET_CHECKSUM(bp) BF64_GET((bp)->blk_prop, 40, 8) +#define BP_SET_CHECKSUM(bp, x) BF64_SET((bp)->blk_prop, 40, 8, x) + +#define BP_GET_TYPE(bp) BF64_GET((bp)->blk_prop, 48, 8) +#define BP_SET_TYPE(bp, x) BF64_SET((bp)->blk_prop, 48, 8, x) + +#define BP_GET_LEVEL(bp) BF64_GET((bp)->blk_prop, 56, 5) +#define BP_SET_LEVEL(bp, x) BF64_SET((bp)->blk_prop, 56, 5, x) + +#define BP_GET_PROP_BIT_61(bp) BF64_GET((bp)->blk_prop, 61, 1) +#define BP_SET_PROP_BIT_61(bp, x) BF64_SET((bp)->blk_prop, 61, 1, x) + +#define BP_GET_DEDUP(bp) BF64_GET((bp)->blk_prop, 62, 1) +#define BP_SET_DEDUP(bp, x) BF64_SET((bp)->blk_prop, 62, 1, x) + +#define BP_GET_BYTEORDER(bp) (0 - BF64_GET((bp)->blk_prop, 63, 1)) +#define BP_SET_BYTEORDER(bp, x) BF64_SET((bp)->blk_prop, 63, 1, x) + +#define BP_PHYSICAL_BIRTH(bp) \ + ((bp)->blk_phys_birth ? (bp)->blk_phys_birth : (bp)->blk_birth) + +#define BP_SET_BIRTH(bp, logical, physical) \ + { \ + (bp)->blk_birth = (logical); \ + (bp)->blk_phys_birth = ((logical) == (physical) ? 
0 : (physical)); \ + } + +#define BP_GET_ASIZE(bp) \ + (DVA_GET_ASIZE(&(bp)->blk_dva[0]) + DVA_GET_ASIZE(&(bp)->blk_dva[1]) + \ + DVA_GET_ASIZE(&(bp)->blk_dva[2])) + +#define BP_GET_UCSIZE(bp) \ + ((BP_GET_LEVEL(bp) > 0 || dmu_ot[BP_GET_TYPE(bp)].ot_metadata) ? \ + BP_GET_PSIZE(bp) : BP_GET_LSIZE(bp)) + +#define BP_GET_NDVAS(bp) \ + (!!DVA_GET_ASIZE(&(bp)->blk_dva[0]) + \ + !!DVA_GET_ASIZE(&(bp)->blk_dva[1]) + \ + !!DVA_GET_ASIZE(&(bp)->blk_dva[2])) + +#define BP_COUNT_GANG(bp) \ + (DVA_GET_GANG(&(bp)->blk_dva[0]) + \ + DVA_GET_GANG(&(bp)->blk_dva[1]) + \ + DVA_GET_GANG(&(bp)->blk_dva[2])) + +#define DVA_EQUAL(dva1, dva2) \ + ((dva1)->dva_word[1] == (dva2)->dva_word[1] && \ + (dva1)->dva_word[0] == (dva2)->dva_word[0]) + +#define BP_EQUAL(bp1, bp2) \ + (BP_PHYSICAL_BIRTH(bp1) == BP_PHYSICAL_BIRTH(bp2) && \ + DVA_EQUAL(&(bp1)->blk_dva[0], &(bp2)->blk_dva[0]) && \ + DVA_EQUAL(&(bp1)->blk_dva[1], &(bp2)->blk_dva[1]) && \ + DVA_EQUAL(&(bp1)->blk_dva[2], &(bp2)->blk_dva[2])) + +#define ZIO_CHECKSUM_EQUAL(zc1, zc2) \ + (0 == (((zc1).zc_word[0] - (zc2).zc_word[0]) | \ + ((zc1).zc_word[1] - (zc2).zc_word[1]) | \ + ((zc1).zc_word[2] - (zc2).zc_word[2]) | \ + ((zc1).zc_word[3] - (zc2).zc_word[3]))) + +#define DVA_IS_VALID(dva) (DVA_GET_ASIZE(dva) != 0) + +#define ZIO_SET_CHECKSUM(zcp, w0, w1, w2, w3) \ + { \ + (zcp)->zc_word[0] = w0; \ + (zcp)->zc_word[1] = w1; \ + (zcp)->zc_word[2] = w2; \ + (zcp)->zc_word[3] = w3; \ + } + +#define BP_IDENTITY(bp) (&(bp)->blk_dva[0]) +#define BP_IS_GANG(bp) DVA_GET_GANG(BP_IDENTITY(bp)) +#define BP_IS_HOLE(bp) ((bp)->blk_birth == 0) + +/* BP_IS_RAIDZ(bp) assumes no block compression */ +#define BP_IS_RAIDZ(bp) (DVA_GET_ASIZE(&(bp)->blk_dva[0]) > \ + BP_GET_PSIZE(bp)) + +#define BP_ZERO(bp) \ + { \ + (bp)->blk_dva[0].dva_word[0] = 0; \ + (bp)->blk_dva[0].dva_word[1] = 0; \ + (bp)->blk_dva[1].dva_word[0] = 0; \ + (bp)->blk_dva[1].dva_word[1] = 0; \ + (bp)->blk_dva[2].dva_word[0] = 0; \ + (bp)->blk_dva[2].dva_word[1] = 0; \ + (bp)->blk_prop = 
0; \ + (bp)->blk_pad[0] = 0; \ + (bp)->blk_pad[1] = 0; \ + (bp)->blk_phys_birth = 0; \ + (bp)->blk_birth = 0; \ + (bp)->blk_fill = 0; \ + ZIO_SET_CHECKSUM(&(bp)->blk_cksum, 0, 0, 0, 0); \ + } + +#define BP_SPRINTF_LEN 320 + +#endif /* ! GRUB_ZFS_SPA_HEADER */ diff --git a/include/zfs/uberblock_impl.h b/include/zfs/uberblock_impl.h new file mode 100644 index 0000000..12daf98 --- /dev/null +++ b/include/zfs/uberblock_impl.h @@ -0,0 +1,57 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 + * Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2010 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. + */ + +#ifndef _SYS_UBERBLOCK_IMPL_H +#define _SYS_UBERBLOCK_IMPL_H + +#define UBMAX(a, b) ((a) > (b) ? (a) : (b)) + +/* + * The uberblock version is incremented whenever an incompatible on-disk + * format change is made to the SPA, DMU, or ZAP. + * + * Note: the first two fields should never be moved. When a storage pool + * is opened, the uberblock must be read off the disk before the version + * can be checked. If the ub_version field is moved, we may not detect + * version mismatch. If the ub_magic field is moved, applications that + * expect the magic number in the first word won't work. + */ +#define UBERBLOCK_MAGIC 0x00bab10c /* oo-ba-bloc! 
*/ +#define UBERBLOCK_SHIFT 10 /* up to 1K */ + +typedef struct uberblock { + uint64_t ub_magic; /* UBERBLOCK_MAGIC */ + uint64_t ub_version; /* ZFS_VERSION */ + uint64_t ub_txg; /* txg of last sync */ + uint64_t ub_guid_sum; /* sum of all vdev guids */ + uint64_t ub_timestamp; /* UTC time of last sync */ + blkptr_t ub_rootbp; /* MOS objset_phys_t */ +} uberblock_t; + +#define VDEV_UBERBLOCK_SHIFT(as) UBMAX(as, UBERBLOCK_SHIFT) +#define UBERBLOCK_SIZE(as) (1ULL << VDEV_UBERBLOCK_SHIFT(as)) + +/* Number of uberblocks that can fit in the ring at a given ashift */ +#define UBERBLOCK_COUNT(as) (VDEV_UBERBLOCK_RING >> VDEV_UBERBLOCK_SHIFT(as)) + +#endif /* _SYS_UBERBLOCK_IMPL_H */ diff --git a/include/zfs/vdev_impl.h b/include/zfs/vdev_impl.h new file mode 100644 index 0000000..97033c9 --- /dev/null +++ b/include/zfs/vdev_impl.h @@ -0,0 +1,69 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2010 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. 
+ */ + +#ifndef _SYS_VDEV_IMPL_H +#define _SYS_VDEV_IMPL_H + +#define VDEV_SKIP_SIZE (8 << 10) +#define VDEV_BOOT_HEADER_SIZE (8 << 10) +#define VDEV_PHYS_SIZE (112 << 10) +#define VDEV_UBERBLOCK_RING (128 << 10) + +/* ZFS boot block */ +#define VDEV_BOOT_MAGIC 0x2f5b007b10cULL +#define VDEV_BOOT_VERSION 1 /* version number */ + +typedef struct vdev_boot_header { + uint64_t vb_magic; /* VDEV_BOOT_MAGIC */ + uint64_t vb_version; /* VDEV_BOOT_VERSION */ + uint64_t vb_offset; /* start offset (bytes) */ + uint64_t vb_size; /* size (bytes) */ + char vb_pad[VDEV_BOOT_HEADER_SIZE - 4 * sizeof(uint64_t)]; +} vdev_boot_header_t; + +typedef struct vdev_phys { + char vp_nvlist[VDEV_PHYS_SIZE - sizeof(zio_eck_t)]; + zio_eck_t vp_zbt; +} vdev_phys_t; + +typedef struct vdev_label { + char vl_pad[VDEV_SKIP_SIZE]; /* 8K */ + vdev_boot_header_t vl_boot_header; /* 8K */ + vdev_phys_t vl_vdev_phys; /* 112K */ + char vl_uberblock[VDEV_UBERBLOCK_RING]; /* 128K */ +} vdev_label_t; /* 256K total */ + +/* + * Size and offset of embedded boot loader region on each label. + * The total size of the first two labels plus the boot area is 4MB. + */ +#define VDEV_BOOT_OFFSET (2 * sizeof(vdev_label_t)) +#define VDEV_BOOT_SIZE (7ULL << 19) /* 3.5M */ + +/* + * Size of label regions at the start and end of each leaf device. + */ +#define VDEV_LABEL_START_SIZE (2 * sizeof(vdev_label_t) + VDEV_BOOT_SIZE) +#define VDEV_LABEL_END_SIZE (2 * sizeof(vdev_label_t)) +#define VDEV_LABELS 4 + +#endif /* _SYS_VDEV_IMPL_H */ diff --git a/include/zfs/zap_impl.h b/include/zfs/zap_impl.h new file mode 100644 index 0000000..65e9311 --- /dev/null +++ b/include/zfs/zap_impl.h @@ -0,0 +1,112 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 + * Free Software Foundation, Inc. 
+ * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2009 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. + */ + +#ifndef _SYS_ZAP_IMPL_H +#define _SYS_ZAP_IMPL_H + +#define ZAP_MAGIC 0x2F52AB2ABULL + +#define ZAP_HASHBITS 28 +#define MZAP_ENT_LEN 64 +#define MZAP_NAME_LEN (MZAP_ENT_LEN - 8 - 4 - 2) +#define MZAP_MAX_BLKSHIFT SPA_MAXBLOCKSHIFT +#define MZAP_MAX_BLKSZ (1 << MZAP_MAX_BLKSHIFT) + +typedef struct mzap_ent_phys { + uint64_t mze_value; + uint32_t mze_cd; + uint16_t mze_pad; /* in case we want to chain them someday */ + char mze_name[MZAP_NAME_LEN]; +} mzap_ent_phys_t; + +typedef struct mzap_phys { + uint64_t mz_block_type; /* ZBT_MICRO */ + uint64_t mz_salt; + uint64_t mz_pad[6]; + mzap_ent_phys_t mz_chunk[1]; + /* actually variable size depending on block size */ +} mzap_phys_t; + +/* + * The (fat) zap is stored in one object. It is an array of + * 1<<FZAP_BLOCK_SHIFT byte blocks. The layout looks like one of: + * + * ptrtbl fits in first block: + * [zap_phys_t zap_ptrtbl_shift < 6] [zap_leaf_t] ... + * + * ptrtbl too big for first block: + * [zap_phys_t zap_ptrtbl_shift >= 6] [zap_leaf_t] [ptrtbl] ... 
+ * + */ + +#define ZBT_LEAF ((1ULL << 63) + 0) +#define ZBT_HEADER ((1ULL << 63) + 1) +#define ZBT_MICRO ((1ULL << 63) + 3) +/* any other values are ptrtbl blocks */ + +/* + * the embedded pointer table takes up half a block: + * block size / entry size (2^3) / 2 + */ +#define ZAP_EMBEDDED_PTRTBL_SHIFT(zap) (FZAP_BLOCK_SHIFT(zap) - 3 - 1) + +/* + * The embedded pointer table starts half-way through the block. Since + * the pointer table itself is half the block, it starts at (64-bit) + * word number (1<<ZAP_EMBEDDED_PTRTBL_SHIFT(zap)). + */ +#define ZAP_EMBEDDED_PTRTBL_ENT(zap, idx) \ + ((uint64_t *)(zap)->zap_f.zap_phys) \ + [(idx) + (1<<ZAP_EMBEDDED_PTRTBL_SHIFT(zap))] + +/* + * TAKE NOTE: + * If zap_phys_t is modified, zap_byteswap() must be modified. + */ +typedef struct zap_phys { + uint64_t zap_block_type; /* ZBT_HEADER */ + uint64_t zap_magic; /* ZAP_MAGIC */ + + struct zap_table_phys { + uint64_t zt_blk; /* starting block number */ + uint64_t zt_numblks; /* number of blocks */ + uint64_t zt_shift; /* bits to index it */ + uint64_t zt_nextblk; /* next (larger) copy start block */ + uint64_t zt_blks_copied; /* number source blocks copied */ + } zap_ptrtbl; + + uint64_t zap_freeblk; /* the next free block */ + uint64_t zap_num_leafs; /* number of leafs */ + uint64_t zap_num_entries; /* number of entries */ + uint64_t zap_salt; /* salt to stir into hash function */ + uint64_t zap_normflags; /* flags for u8_textprep_str() */ + uint64_t zap_flags; /* zap_flag_t */ + /* + * This structure is followed by padding, and then the embedded + * pointer table. The embedded pointer table takes up second + * half of the block. It is accessed using the + * ZAP_EMBEDDED_PTRTBL_ENT() macro. 
+ */ +} zap_phys_t; + +#endif /* _SYS_ZAP_IMPL_H */ diff --git a/include/zfs/zap_leaf.h b/include/zfs/zap_leaf.h new file mode 100644 index 0000000..4ddddb5 --- /dev/null +++ b/include/zfs/zap_leaf.h @@ -0,0 +1,103 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 + * Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2007 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. + */ + +#ifndef _SYS_ZAP_LEAF_H +#define _SYS_ZAP_LEAF_H + +#define ZAP_LEAF_MAGIC 0x2AB1EAF + +/* chunk size = 24 bytes */ +#define ZAP_LEAF_CHUNKSIZE 24 + +/* + * The amount of space within the chunk available for the array is: + * chunk size - space for type (1) - space for next pointer (2) + */ +#define ZAP_LEAF_ARRAY_BYTES (ZAP_LEAF_CHUNKSIZE - 3) + +typedef enum zap_chunk_type { + ZAP_CHUNK_FREE = 253, + ZAP_CHUNK_ENTRY = 252, + ZAP_CHUNK_ARRAY = 251, + ZAP_CHUNK_TYPE_MAX = 250 +} zap_chunk_type_t; + +/* + * TAKE NOTE: + * If zap_leaf_phys_t is modified, zap_leaf_byteswap() must be modified. 
+ */ +typedef struct zap_leaf_phys { + struct zap_leaf_header { + uint64_t lh_block_type; /* ZBT_LEAF */ + uint64_t lh_pad1; + uint64_t lh_prefix; /* hash prefix of this leaf */ + uint32_t lh_magic; /* ZAP_LEAF_MAGIC */ + uint16_t lh_nfree; /* number of free chunks */ + uint16_t lh_nentries; /* number of entries */ + uint16_t lh_prefix_len; /* num bits used to id this */ + + /* above is accessible to zap, below is zap_leaf private */ + + uint16_t lh_freelist; /* chunk head of free list */ + uint8_t lh_pad2[12]; + } l_hdr; /* 2 24-byte chunks */ + + /* + * The header is followed by a hash table with + * ZAP_LEAF_HASH_NUMENTRIES(zap) entries. The hash table is + * followed by an array of ZAP_LEAF_NUMCHUNKS(zap) + * zap_leaf_chunk structures. These structures are accessed + * with the ZAP_LEAF_CHUNK() macro. + */ + + uint16_t l_hash[1]; +} zap_leaf_phys_t; + +typedef union zap_leaf_chunk { + struct zap_leaf_entry { + uint8_t le_type; /* always ZAP_CHUNK_ENTRY */ + uint8_t le_int_size; /* size of ints */ + uint16_t le_next; /* next entry in hash chain */ + uint16_t le_name_chunk; /* first chunk of the name */ + uint16_t le_name_length; /* bytes in name, incl null */ + uint16_t le_value_chunk; /* first chunk of the value */ + uint16_t le_value_length; /* value length in ints */ + uint32_t le_cd; /* collision differentiator */ + uint64_t le_hash; /* hash value of the name */ + } l_entry; + struct zap_leaf_array { + uint8_t la_type; /* always ZAP_CHUNK_ARRAY */ + union { + uint8_t la_array[ZAP_LEAF_ARRAY_BYTES]; + uint64_t la_array64; + } __attribute__ ((packed)); + uint16_t la_next; /* next blk or CHAIN_END */ + } l_array; + struct zap_leaf_free { + uint8_t lf_type; /* always ZAP_CHUNK_FREE */ + uint8_t lf_pad[ZAP_LEAF_ARRAY_BYTES]; + uint16_t lf_next; /* next in free list, or CHAIN_END */ + } l_free; +} zap_leaf_chunk_t; + +#endif /* _SYS_ZAP_LEAF_H */ diff --git a/include/zfs/zfs.h b/include/zfs/zfs.h new file mode 100644 index 0000000..b6d41c0 --- /dev/null +++ 
b/include/zfs/zfs.h @@ -0,0 +1,122 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004,2009 + * Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ + /* + * Copyright (c) 2007, 2010, Oracle and/or its affiliates. All rights reserved. + */ + +#ifndef GRUB_ZFS_HEADER +#define GRUB_ZFS_HEADER 1 + + +/* + * On-disk version number. + */ +#define SPA_VERSION 28ULL + +/* + * The following are configuration names used in the nvlist describing a pool's + * configuration. 
+ */ +#define ZPOOL_CONFIG_VERSION "version" +#define ZPOOL_CONFIG_POOL_NAME "name" +#define ZPOOL_CONFIG_POOL_STATE "state" +#define ZPOOL_CONFIG_POOL_TXG "txg" +#define ZPOOL_CONFIG_POOL_GUID "pool_guid" +#define ZPOOL_CONFIG_CREATE_TXG "create_txg" +#define ZPOOL_CONFIG_TOP_GUID "top_guid" +#define ZPOOL_CONFIG_VDEV_TREE "vdev_tree" +#define ZPOOL_CONFIG_TYPE "type" +#define ZPOOL_CONFIG_CHILDREN "children" +#define ZPOOL_CONFIG_ID "id" +#define ZPOOL_CONFIG_GUID "guid" +#define ZPOOL_CONFIG_PATH "path" +#define ZPOOL_CONFIG_DEVID "devid" +#define ZPOOL_CONFIG_METASLAB_ARRAY "metaslab_array" +#define ZPOOL_CONFIG_METASLAB_SHIFT "metaslab_shift" +#define ZPOOL_CONFIG_ASHIFT "ashift" +#define ZPOOL_CONFIG_ASIZE "asize" +#define ZPOOL_CONFIG_DTL "DTL" +#define ZPOOL_CONFIG_STATS "stats" +#define ZPOOL_CONFIG_WHOLE_DISK "whole_disk" +#define ZPOOL_CONFIG_ERRCOUNT "error_count" +#define ZPOOL_CONFIG_NOT_PRESENT "not_present" +#define ZPOOL_CONFIG_SPARES "spares" +#define ZPOOL_CONFIG_IS_SPARE "is_spare" +#define ZPOOL_CONFIG_NPARITY "nparity" +#define ZPOOL_CONFIG_PHYS_PATH "phys_path" +#define ZPOOL_CONFIG_L2CACHE "l2cache" +#define ZPOOL_CONFIG_HOLE_ARRAY "hole_array" +#define ZPOOL_CONFIG_VDEV_CHILDREN "vdev_children" +#define ZPOOL_CONFIG_IS_HOLE "is_hole" +#define ZPOOL_CONFIG_DDT_HISTOGRAM "ddt_histogram" +#define ZPOOL_CONFIG_DDT_OBJ_STATS "ddt_object_stats" +#define ZPOOL_CONFIG_DDT_STATS "ddt_stats" +/* + * The persistent vdev state is stored as separate values rather than a single + * 'vdev_state' entry. This is because a device can be in multiple states, such + * as offline and degraded. 
+ */ +#define ZPOOL_CONFIG_OFFLINE "offline" +#define ZPOOL_CONFIG_FAULTED "faulted" +#define ZPOOL_CONFIG_DEGRADED "degraded" +#define ZPOOL_CONFIG_REMOVED "removed" + +#define VDEV_TYPE_ROOT "root" +#define VDEV_TYPE_MIRROR "mirror" +#define VDEV_TYPE_REPLACING "replacing" +#define VDEV_TYPE_RAIDZ "raidz" +#define VDEV_TYPE_DISK "disk" +#define VDEV_TYPE_FILE "file" +#define VDEV_TYPE_MISSING "missing" +#define VDEV_TYPE_HOLE "hole" +#define VDEV_TYPE_SPARE "spare" +#define VDEV_TYPE_L2CACHE "l2cache" + +/* + * pool state. The following states are written to disk as part of the normal + * SPA lifecycle: ACTIVE, EXPORTED, DESTROYED, SPARE, L2CACHE. The remaining + * states are software abstractions used at various levels to communicate pool + * state. + */ +typedef enum pool_state { + POOL_STATE_ACTIVE = 0, /* In active use */ + POOL_STATE_EXPORTED, /* Explicitly exported */ + POOL_STATE_DESTROYED, /* Explicitly destroyed */ + POOL_STATE_SPARE, /* Reserved for hot spare use */ + POOL_STATE_L2CACHE, /* Level 2 ARC device */ + POOL_STATE_UNINITIALIZED, /* Internal spa_t state */ + POOL_STATE_UNAVAIL, /* Internal libzfs state */ + POOL_STATE_POTENTIALLY_ACTIVE /* Internal libzfs state */ +} pool_state_t; + +struct grub_zfs_data; + +int grub_zfs_fetch_nvlist(device_t dev, char **nvlist); +int grub_zfs_getmdnobj(device_t dev, const char *fsfilename, + uint64_t *mdnobj); + +char *grub_zfs_nvlist_lookup_string(char *nvlist, char *name); +char *grub_zfs_nvlist_lookup_nvlist(char *nvlist, char *name); +int grub_zfs_nvlist_lookup_uint64(char *nvlist, char *name, + uint64_t *out); +char *grub_zfs_nvlist_lookup_nvlist_array(char *nvlist, char *name, + size_t index); +int grub_zfs_nvlist_lookup_nvlist_array_get_nelm(char *nvlist, char *name); + +#endif /* ! 
GRUB_ZFS_HEADER */ diff --git a/include/zfs/zfs_acl.h b/include/zfs/zfs_acl.h new file mode 100644 index 0000000..66749af --- /dev/null +++ b/include/zfs/zfs_acl.h @@ -0,0 +1,55 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 + * Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2007 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. 
+ */ + +#ifndef _SYS_FS_ZFS_ACL_H +#define _SYS_FS_ZFS_ACL_H + +typedef struct zfs_oldace { + uint32_t z_fuid; /* "who" */ + uint32_t z_access_mask; /* access mask */ + uint16_t z_flags; /* flags, i.e. inheritance */ + uint16_t z_type; /* type of entry allow/deny */ +} zfs_oldace_t; + +#define ACE_SLOT_CNT 6 + +typedef struct zfs_znode_acl_v0 { + uint64_t z_acl_extern_obj; /* ext acl pieces */ + uint32_t z_acl_count; /* Number of ACEs */ + uint16_t z_acl_version; /* acl version */ + uint16_t z_acl_pad; /* pad */ + zfs_oldace_t z_ace_data[ACE_SLOT_CNT]; /* 6 standard ACEs */ +} zfs_znode_acl_v0_t; + +#define ZFS_ACE_SPACE (sizeof(zfs_oldace_t) * ACE_SLOT_CNT) + +typedef struct zfs_znode_acl { + uint64_t z_acl_extern_obj; /* ext acl pieces */ + uint32_t z_acl_size; /* Number of bytes in ACL */ + uint16_t z_acl_version; /* acl version */ + uint16_t z_acl_count; /* ace count */ + uint8_t z_ace_data[ZFS_ACE_SPACE]; /* space for embedded ACEs */ +} zfs_znode_acl_t; + + +#endif /* _SYS_FS_ZFS_ACL_H */ diff --git a/include/zfs/zfs_znode.h b/include/zfs/zfs_znode.h new file mode 100644 index 0000000..e3265e3 --- /dev/null +++ b/include/zfs/zfs_znode.h @@ -0,0 +1,70 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2010 Sun Microsystems, Inc. 
All rights reserved. + * Use is subject to license terms. + */ + +#ifndef _SYS_FS_ZFS_ZNODE_H +#define _SYS_FS_ZFS_ZNODE_H + +#include <zfs/zfs_acl.h> + +#define MASTER_NODE_OBJ 1 +#define ZFS_ROOT_OBJ "ROOT" +#define ZPL_VERSION_STR "VERSION" +#define ZFS_SA_ATTRS "SA_ATTRS" + +#define ZPL_VERSION 5ULL + +#define ZFS_DIRENT_OBJ(de) BF64_GET(de, 0, 48) + +/* + * This is the persistent portion of the znode. It is stored + * in the "bonus buffer" of the file. Short symbolic links + * are also stored in the bonus buffer. + */ +typedef struct znode_phys { + uint64_t zp_atime[2]; /* 0 - last file access time */ + uint64_t zp_mtime[2]; /* 16 - last file modification time */ + uint64_t zp_ctime[2]; /* 32 - last file change time */ + uint64_t zp_crtime[2]; /* 48 - creation time */ + uint64_t zp_gen; /* 64 - generation (txg of creation) */ + uint64_t zp_mode; /* 72 - file mode bits */ + uint64_t zp_size; /* 80 - size of file */ + uint64_t zp_parent; /* 88 - directory parent (`..') */ + uint64_t zp_links; /* 96 - number of links to file */ + uint64_t zp_xattr; /* 104 - DMU object for xattrs */ + uint64_t zp_rdev; /* 112 - dev_t for VBLK & VCHR files */ + uint64_t zp_flags; /* 120 - persistent flags */ + uint64_t zp_uid; /* 128 - file owner */ + uint64_t zp_gid; /* 136 - owning group */ + uint64_t zp_pad[4]; /* 144 - future */ + zfs_znode_acl_t zp_acl; /* 176 - 263 ACL */ + /* + * Data may pad out any remaining bytes in the znode buffer, eg: + * + * |<---------------------- dnode_phys (512) ------------------------>| + * |<-- dnode (192) --->|<----------- "bonus" buffer (320) ---------->| + * |<---- znode (264) ---->|<---- data (56) ---->| + * + * At present, we only use this space to store symbolic links. 
+ */ +} znode_phys_t; + +#endif /* _SYS_FS_ZFS_ZNODE_H */ diff --git a/include/zfs/zil.h b/include/zfs/zil.h new file mode 100644 index 0000000..bc9d5e9 --- /dev/null +++ b/include/zfs/zil.h @@ -0,0 +1,56 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2009 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. + */ + +#ifndef _SYS_ZIL_H +#define _SYS_ZIL_H + +/* + * Intent log format: + * + * Each objset has its own intent log. The log header (zil_header_t) + * for objset N's intent log is kept in the Nth object of the SPA's + * intent_log objset. The log header points to a chain of log blocks, + * each of which contains log records (i.e., transactions) followed by + * a log block trailer (zil_trailer_t). The format of a log record + * depends on the record (or transaction) type, but all records begin + * with a common structure that defines the type, length, and txg. + */ + +/* + * Intent log header - this on disk structure holds fields to manage + * the log. All fields are 64 bit to easily handle cross architectures. 
+ */ +typedef struct zil_header { + uint64_t zh_claim_txg; /* txg in which log blocks were claimed */ + uint64_t zh_replay_seq; /* highest replayed sequence number */ + blkptr_t zh_log; /* log chain */ + uint64_t zh_claim_seq; /* highest claimed sequence number */ + uint64_t zh_flags; /* header flags */ + uint64_t zh_pad[4]; +} zil_header_t; + +/* + * zh_flags bit settings + */ +#define ZIL_REPLAY_NEEDED 0x1 /* replay needed - internal only */ + +#endif /* _SYS_ZIL_H */ diff --git a/include/zfs/zio.h b/include/zfs/zio.h new file mode 100644 index 0000000..38f90d5 --- /dev/null +++ b/include/zfs/zio.h @@ -0,0 +1,92 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2010 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. + */ + +#ifndef _ZIO_H +#define _ZIO_H + +#include <zfs/spa.h> + +#define ZEC_MAGIC 0x210da7ab10c7a11ULL /* zio data bloc tail */ + +typedef struct zio_eck { + uint64_t zec_magic; /* for validation, endianness */ + zio_cksum_t zec_cksum; /* 256-bit checksum */ +} zio_eck_t; + +/* + * Gang block headers are self-checksumming and contain an array + * of block pointers. 
+ */ +#define SPA_GANGBLOCKSIZE SPA_MINBLOCKSIZE +#define SPA_GBH_NBLKPTRS ((SPA_GANGBLOCKSIZE - \ + sizeof(zio_eck_t)) / sizeof(blkptr_t)) +#define SPA_GBH_FILLER ((SPA_GANGBLOCKSIZE - \ + sizeof(zio_eck_t) - \ + (SPA_GBH_NBLKPTRS * sizeof(blkptr_t))) /\ + sizeof(uint64_t)) + +#define ZIO_GET_IOSIZE(zio) \ + (BP_IS_GANG((zio)->io_bp) ? \ + SPA_GANGBLOCKSIZE : BP_GET_PSIZE((zio)->io_bp)) + +typedef struct zio_gbh { + blkptr_t zg_blkptr[SPA_GBH_NBLKPTRS]; + uint64_t zg_filler[SPA_GBH_FILLER]; + zio_eck_t zg_tail; +} zio_gbh_phys_t; + +enum zio_checksum { + ZIO_CHECKSUM_INHERIT = 0, + ZIO_CHECKSUM_ON, + ZIO_CHECKSUM_OFF, + ZIO_CHECKSUM_LABEL, + ZIO_CHECKSUM_GANG_HEADER, + ZIO_CHECKSUM_ZILOG, + ZIO_CHECKSUM_FLETCHER_2, + ZIO_CHECKSUM_FLETCHER_4, + ZIO_CHECKSUM_SHA256, + ZIO_CHECKSUM_ZILOG2, + ZIO_CHECKSUM_FUNCTIONS +}; + +#define ZIO_CHECKSUM_ON_VALUE ZIO_CHECKSUM_FLETCHER_2 +#define ZIO_CHECKSUM_DEFAULT ZIO_CHECKSUM_ON + +enum zio_compress { + ZIO_COMPRESS_INHERIT = 0, + ZIO_COMPRESS_ON, + ZIO_COMPRESS_OFF, + ZIO_COMPRESS_LZJB, + ZIO_COMPRESS_EMPTY, + ZIO_COMPRESS_GZIP1, + ZIO_COMPRESS_GZIP2, + ZIO_COMPRESS_GZIP3, + ZIO_COMPRESS_GZIP4, + ZIO_COMPRESS_GZIP5, + ZIO_COMPRESS_GZIP6, + ZIO_COMPRESS_GZIP7, + ZIO_COMPRESS_GZIP8, + ZIO_COMPRESS_GZIP9, + ZIO_COMPRESS_FUNCTIONS +}; + +#endif /* _ZIO_H */ diff --git a/include/zfs/zio_checksum.h b/include/zfs/zio_checksum.h new file mode 100644 index 0000000..8ade44a --- /dev/null +++ b/include/zfs/zio_checksum.h @@ -0,0 +1,49 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 3 of the License, or + * (at your option) any later version. 
+ * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + */ +/* + * Copyright 2010 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. + */ + +#ifndef _SYS_ZIO_CHECKSUM_H +#define _SYS_ZIO_CHECKSUM_H + +/* + * Signature for checksum functions. + */ +typedef void zio_checksum_t(const void *data, uint64_t size, + grub_zfs_endian_t endian, zio_cksum_t *zcp); + +/* + * Information about each checksum function. + */ +typedef struct zio_checksum_info { + zio_checksum_t *ci_func; /* checksum function for each byteorder */ + int ci_correctable; /* number of correctable bits */ + int ci_eck; /* uses zio embedded checksum? */ + char *ci_name; /* descriptive name */ +} zio_checksum_info_t; + +extern void zio_checksum_SHA256(const void *, uint64_t, + grub_zfs_endian_t endian, zio_cksum_t *); +extern void fletcher_2(const void *, uint64_t, grub_zfs_endian_t endian, + zio_cksum_t *); +extern void fletcher_4(const void *, uint64_t, grub_zfs_endian_t endian, + zio_cksum_t *); + +#endif /* _SYS_ZIO_CHECKSUM_H */ diff --git a/include/zfs_common.h b/include/zfs_common.h new file mode 100644 index 0000000..969dbf5 --- /dev/null +++ b/include/zfs_common.h @@ -0,0 +1,94 @@ +/* + * ZFS filesystem implementation in U-Boot by + * Jorgen Lundman <lundman at lundman.net> + * + * zfsfs support + * made from existing GRUB Sources by Sun, GNU and others. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. 
+ * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. + */ + +#ifndef __ZFS_COMMON__ +#define __ZFS_COMMON__ + +#define SECTOR_SIZE 0x200 +#define SECTOR_BITS 9 + +#define grub_le_to_cpu16 le16_to_cpu +#define grub_be_to_cpu16 be16_to_cpu +#define grub_le_to_cpu32 le32_to_cpu +#define grub_be_to_cpu32 be32_to_cpu +#define grub_le_to_cpu64 le64_to_cpu +#define grub_be_to_cpu64 be64_to_cpu + +#define grub_cpu_to_le64 cpu_to_le64 +#define grub_cpu_to_be64 cpu_to_be64 + +enum zfs_errors { + ZFS_ERR_NONE = 0, + ZFS_ERR_NOT_IMPLEMENTED_YET = -1, + ZFS_ERR_BAD_FS = -2, + ZFS_ERR_OUT_OF_MEMORY = -3, + ZFS_ERR_FILE_NOT_FOUND = -4, + ZFS_ERR_BAD_FILE_TYPE = -5, + ZFS_ERR_OUT_OF_RANGE = -6, +}; + +struct zfs_filesystem { + + /* Block Device Descriptor */ + block_dev_desc_t *dev_desc; +}; + + +extern block_dev_desc_t *zfs_dev_desc; + +struct device_s { + uint64_t part_length; +}; +typedef struct device_s *device_t; + +struct zfs_file { + device_t device; + uint64_t size; + void *data; + uint64_t offset; +}; + +typedef struct zfs_file *zfs_file_t; + +struct zfs_dirhook_info { + int dir; + int mtimeset; + time_t mtime; + time_t mtime2; +}; + + + + +struct zfs_filesystem *zfsget_fs(void); +int init_fs(block_dev_desc_t *dev_desc); +void deinit_fs(block_dev_desc_t *dev_desc); +int zfs_open(zfs_file_t, const char *filename); +uint64_t zfs_read(zfs_file_t, char *buf, uint64_t len); +struct grub_zfs_data *zfs_mount(device_t); +int zfs_close(zfs_file_t); +int zfs_ls(device_t dev, const char *path, + int (*hook) (const char *, const struct zfs_dirhook_info *)); +int zfs_devread(int sector, int 
byte_offset, int byte_len, char *buf); +int zfs_set_blk_dev(block_dev_desc_t *rbdd, int part); +void zfs_unmount(struct grub_zfs_data *data); +int lzjb_decompress(void *, void *, uint32_t, uint32_t); +#endif

Do I have to do anything special at this point, or can I assume everything is going according to plan?
Lund

Hi Lund,
On Thu, Jul 5, 2012 at 1:34 PM, Jorgen Lundman lundman@lundman.net wrote:
Do I have to do anything special at this point, or can I assume everything is going according to plan?
I've had a quick look - the intrusion into common code is minimal (additions to Makefiles) so the risk to U-Boot stability is extremely low and there is no impact on code size if ZFS is not enabled. I see no reason it can't be integrated.
I don't know where your original submission sat within the release cycle. If it was before the closing of the merge window then maybe Wolfgang will include it in the upcoming RC.
The only thing you need to do now is wait :)
Oh, and maybe prod the mailing list every now and again ;)
Regards,
Graeme

On Thursday 24 May 2012 22:11:46 Jorgen Lundman wrote:
U-Boot port is based on sources forked from GRUB-0.97 by Sun in 2004, which can be found here: http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/grub/grub-0.97/stage2/zfs-include/zfs.h
Released by Sun for GRUB under the license:
- This program is free software; you can redistribute it and/or modify
- it under the terms of the GNU General Public License as published by
- the Free Software Foundation; either version 2 of the License, or
- (at your option) any later version.
good ...
--- /dev/null +++ b/fs/zfs/zfs_fletcher.c @@ -0,0 +1,84 @@ +/*
- GRUB -- GRand Unified Bootloader
- Copyright (C) 1999,2000,2001,2002,2003,2004,2009
- Free Software Foundation, Inc.
- Copyright 2007 Sun Microsystems, Inc.
- GRUB is free software; you can redistribute it and/or modify
- it under the terms of the GNU General Public License as published by
- the Free Software Foundation; either version 3 of the License, or
- (at your option) any later version.
... but it looks like you still have some GPL-3 gremlins lurking. this must be sorted out before we can consider the code for merging. i suspect simply changing "version 3" to "version 2" isn't the right answer, otherwise i wonder where you copied this code from such that it has "version 3" in the first place. -mike

... but it looks like you still have some GPL-3 gremlins lurking. this must be sorted out before we can consider the code for merging. i suspect simply changing "version 3" to "version 2" isn't the right answer, otherwise i wonder where you copied this code from such that it has "version 3" in the first place. -mike
The code submitted by Sun to GRUB is version 2, and you can see that in the first GRUB version with ZFS (0.97 - I posted the url earlier).
However, GRUB opted to go higher ("or at your option any later version") sometime before the GRUB 2 release. I took the latest version available of the sources at the time of porting.
However, there is only one functional source change between the versions, and that is adding ashift support, which was supplied as a patch.
That "ZFS" and "license" produces a knee-jerk reaction is a little tedious. It is not the problem everyone thinks, and I invite you to run ZFS on Linux native. http://zfsonlinux.org/
Lund

Hi Lund,
On Thu, Jul 19, 2012 at 10:14 AM, Jorgen Lundman lundman@lundman.net wrote:
... but it looks like you still have some GPL-3 gremlins lurking. this must be sorted out before we can consider the code for merging. i suspect simply changing "version 3" to "version 2" isn't the right answer, otherwise i wonder where you copied this code from such that it has "version 3" in the first place. -mike
Damn, I totally missed this - my apologies
The code submitted by Sun to GRUB is version 2, and you can see that in the first GRUB version with ZFS (0.97 - I posted the url earlier).
OK, so we have a specific version of GRUB where all ZFS code is GPL version 2 or later (i.e. no extraneous GPL version 3 license text)
However, GRUB opted to go higher ("or at your option any later version") sometime before the GRUB 2 release. I took the latest version available of the sources at the time of porting.
Unfortunately, what you need to do is pull from the version of GRUB prior to the license text changes and explicitly note the commit ID in the commit comment
However, there is only one functional source change between the versions, and that is adding ashift support, which was supplied as a patch.
Sorry, I don't know if I quite follow. What I am guessing is the following sequence:
- Initial ZFS port to GRUB under GPLv2 or later
- GRUB changed to GPLv3 or later
- ZFS ashift support added to GRUB
If this is the case, ashift support is GPLv3 and cannot be added to U-Boot unless the author of the patch agrees
That "ZFS" and "license" produces a knee-jerk reaction is a little tedious.
Yes, I agree that it is tedious, but it is a legal issue and we cannot simply side-step it
It is not the problem everyone thinks, and I invite you to run ZFS on Linux native. http://zfsonlinux.org/
See http://zfsonlinux.org/faq.html#WhatAboutTheLicensingIssue
ZFS on Linux bypasses the GPL restrictions by being implemented in a loadable module. And as such, using it will taint the kernel...
Regards,
Graeme

On Wednesday 18 July 2012 20:14:21 Jorgen Lundman wrote:
That "ZFS" and "license" produces a knee-jerk reaction is a little tedious.
i could care less about ZFS. i think you missed the entire point i highlighted: we cannot accept GPLv3 code. u-boot is currently GPLv2, so adding GPLv3 simply won't work. -mike

Mike Frysinger wrote:
i could care less about ZFS. i think you missed the entire point i highlighted: we cannot accept GPLv3 code. u-boot is currently GPLv2, so adding GPLv3 simply won't work. -mike
Very well, to attempt to go around this I would then have to use the original file, and port it forward to u-boot;
Original Sun Version; : solaris11/usr/src/grub/grub-0.97/stage2/zfs_fletcher.c
* the Free Software Foundation; either version 2 of the License, or
fletcher_2_native(const void *buf, uint64_t size, zio_cksum_t *zcp)
fletcher_2_byteswap(const void *buf, uint64_t size, zio_cksum_t *zcp)
Current u-boot patch version;
* the Free Software Foundation; either version 3 of the License, or
fletcher_2(const void *buf, uint64_t size, grub_zfs_endian_t endian, zio_cksum_t *zcp)
fletcher_4(const void *buf, uint64_t size, grub_zfs_endian_t endian, zio_cksum_t *zcp)
Which _effectively_ changes the "3" to "2", and renames two functions. The file only contains 2 functions, and refers to the "checksumming algorithm the user can choose to use on a filesystem".
Then, it is acceptable?
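
For readers following the signature change quoted above, here is a minimal sketch of how a pair of native/byteswap Fletcher-2 routines can collapse into one function that takes the on-disk byte order as a parameter. It is not the code from either GRUB version or from the patch: the names sketch_endian_t, sketch_cksum_t, sketch_load64 and sketch_fletcher_2 are hypothetical, and as a simplification the result is left in host order. The running sums follow the commonly documented Fletcher-2 scheme of two interleaved 64-bit streams.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for illustration; not the identifiers used in the patch. */
typedef enum { SKETCH_LITTLE_ENDIAN, SKETCH_BIG_ENDIAN } sketch_endian_t;

typedef struct {
	uint64_t word[4];
} sketch_cksum_t;

/* Assemble one 64-bit word from 8 bytes stored in the given byte order. */
static uint64_t sketch_load64(const uint8_t *p, sketch_endian_t endian)
{
	uint64_t v = 0;
	int i;

	for (i = 0; i < 8; i++) {
		/* Feed the most significant byte first. */
		int idx = (endian == SKETCH_LITTLE_ENDIAN) ? 7 - i : i;

		v = (v << 8) | p[idx];
	}
	return v;
}

/*
 * Fletcher-2 over a buffer of 16-byte pairs. The endian argument replaces
 * the need for separate "native" and "byteswap" entry points.
 */
static void sketch_fletcher_2(const void *buf, uint64_t size,
			      sketch_endian_t endian, sketch_cksum_t *zcp)
{
	const uint8_t *p = buf;
	const uint8_t *end = p + size;
	uint64_t a0 = 0, a1 = 0, b0 = 0, b1 = 0;

	assert(size % 16 == 0);	/* two 64-bit words per iteration */

	for (; p < end; p += 16) {
		a0 += sketch_load64(p, endian);
		a1 += sketch_load64(p + 8, endian);
		b0 += a0;
		b1 += a1;
	}

	/* Simplification: result words are kept in host byte order. */
	zcp->word[0] = a0;
	zcp->word[1] = a1;
	zcp->word[2] = b0;
	zcp->word[3] = b1;
}
```

Because the words are decoded byte by byte, the same data encoded in either byte order yields the same checksum on any host, which is the property the endian parameter is meant to provide.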

On Wednesday 18 July 2012 23:20:05 Jorgen Lundman wrote:
Mike Frysinger wrote:
i could care less about ZFS. i think you missed the entire point i highlighted: we cannot accept GPLv3 code. u-boot is currently GPLv2, so adding GPLv3 simply won't work.
Very well, to attempt to go around this I would then have to use the original file, and port it forward to u-boot;
i just picked out the first "version 3" i saw in your patch. going back to it and searching again shows more than just one file. -mike

Mike Frysinger wrote:
i just picked out the first "version 3" i saw in your patch. going back to it and searching again shows more than just one file. -mike
You are correct, 3 files to be precise:
zfs_fletcher.c zfs_lzjb.c zfs_sha256.c
but the other two are functionally not changed.
Syntactically, they have gone through a "unsigned char *" to "grub_uint8_t *" to "uint8_t *".
Not entirely sure how I would "re-write" my "uint8_t *" change to be more clearly from "unsigned char *", as opposed to "grub_uint8_t *", to show it is based on the version 2 of the license.

On Thursday 19 July 2012 00:51:23 Jorgen Lundman wrote:
Mike Frysinger wrote:
i just picked out the first "version 3" i saw in your patch. going back to it and searching again shows more than just one file.
You are correct, 3 files to be precise:
zfs_fletcher.c zfs_lzjb.c zfs_sha256.c
but the other two are functionally not changed.
Syntactically, they have gone through a "unsigned char *" to "grub_uint8_t *" to "uint8_t *".
that's fine
Not entirely sure how I would "re-write" my "uint8_t *" change to be more clearly from "unsigned char *", as opposed to "grub_uint8_t *", to show it is based on the version 2 of the license.
post a new patchset with the correct version, and detail these things justifying the content/license shifts -mike

Took a few hours to go back and start from GPL2 versions, but in the end we got rid of the grub_ defines as a bonus.
Patch to add ZFS filesystem support to u-boot, based on GRUB sources. Thank you for your patience.
Jorgen Lundman (1): Add ZFS support
 Makefile                     |    2 +-
 common/Makefile              |    1 +
 common/cmd_zfs.c             |  236 +++++
 doc/README.zfs               |   30 +
 fs/Makefile                  |    1 +
 fs/{ => zfs}/Makefile        |   39 +-
 fs/zfs/dev.c                 |  137 +++
 fs/zfs/zfs.c                 | 2396 ++++++++++++++++++++++++++++++++++++++++++
 fs/zfs/zfs_fletcher.c        |   88 ++
 fs/zfs/zfs_lzjb.c            |   97 ++
 fs/zfs/zfs_sha256.c          |  148 +++
 include/config_cmd_all.h     |    1 +
 include/zfs/dmu.h            |  120 +++
 include/zfs/dmu_objset.h     |   43 +
 include/zfs/dnode.h          |   81 ++
 include/zfs/dsl_dataset.h    |   53 +
 include/zfs/dsl_dir.h        |   49 +
 include/zfs/sa_impl.h        |   35 +
 include/zfs/spa.h            |  292 +++++
 include/zfs/uberblock_impl.h |   57 +
 include/zfs/vdev_impl.h      |   70 ++
 include/zfs/zap_impl.h       |  111 ++
 include/zfs/zap_leaf.h       |  103 ++
 include/zfs/zfs.h            |  122 +++
 include/zfs/zfs_acl.h        |   55 +
 include/zfs/zfs_znode.h      |   71 ++
 include/zfs/zil.h            |   57 +
 include/zfs/zio.h            |   92 ++
 include/zfs/zio_checksum.h   |   50 +
 include/zfs_common.h         |  109 ++
 30 files changed, 4730 insertions(+), 16 deletions(-)
 create mode 100644 common/cmd_zfs.c
 create mode 100644 doc/README.zfs
 copy fs/{ => zfs}/Makefile (56%)
 create mode 100644 fs/zfs/dev.c
 create mode 100644 fs/zfs/zfs.c
 create mode 100644 fs/zfs/zfs_fletcher.c
 create mode 100644 fs/zfs/zfs_lzjb.c
 create mode 100644 fs/zfs/zfs_sha256.c
 create mode 100644 include/zfs/dmu.h
 create mode 100644 include/zfs/dmu_objset.h
 create mode 100644 include/zfs/dnode.h
 create mode 100644 include/zfs/dsl_dataset.h
 create mode 100644 include/zfs/dsl_dir.h
 create mode 100644 include/zfs/sa_impl.h
 create mode 100644 include/zfs/spa.h
 create mode 100644 include/zfs/uberblock_impl.h
 create mode 100644 include/zfs/vdev_impl.h
 create mode 100644 include/zfs/zap_impl.h
 create mode 100644 include/zfs/zap_leaf.h
 create mode 100644 include/zfs/zfs.h
 create mode 100644 include/zfs/zfs_acl.h
 create mode 100644 include/zfs/zfs_znode.h
 create mode 100644 include/zfs/zil.h
 create mode 100644 include/zfs/zio.h
 create mode 100644 include/zfs/zio_checksum.h
 create mode 100644 include/zfs_common.h

U-Boot port is based on sources forked from GRUB-0.97 by Sun in 2004, which can be found here: http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/grub/grub-0.97...
Released by Sun for GRUB under the license: * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 2 of the License, or * (at your option) any later version.
GRUB official releases include ZFS in version: ftp://alpha.gnu.org/gnu/grub/grub-1.99~rc1.tar.gz
And patched against GRUB Bazaar repository for ashift fixes (4KB HDDs) more conveniently found at github: https://github.com/pendor/grub-zfs/commit/e7b6ef3ac3b9685ac4c394c897b1d4221b...
Signed-off-by: Jorgen Lundman lundman@lundman.net
---
v5: * Re-port based on GPLv2 license files, from original Sun GRUB-0.97 and patch forward. Headers remained untouched, minor style changes in some functions. No logic changes required.
v4: * Add doc/README.zfs documentation
v3: * add missing patch revision history (this text) * Submitted as single patch per Wolfgang Denk instructions
v2: * Keep Makefile placement alphabetically sorted. * Clean ugly line breaks and indentation errors * Fix license corruption in fs/Makefile
---
diff --git a/Makefile b/Makefile
index 6e8b5a7..48800c3 100644
--- a/Makefile
+++ b/Makefile
@@ -247,7 +247,7 @@ endif
 LIBS += arch/$(ARCH)/lib/lib$(ARCH).o
 LIBS += fs/cramfs/libcramfs.o fs/fat/libfat.o fs/fdos/libfdos.o fs/jffs2/libjffs2.o \
 	fs/reiserfs/libreiserfs.o fs/ext2/libext2fs.o fs/yaffs2/libyaffs2.o \
-	fs/ubifs/libubifs.o
+	fs/ubifs/libubifs.o fs/zfs/libzfs.o
 LIBS += net/libnet.o
 LIBS += disk/libdisk.o
 LIBS += drivers/bios_emulator/libatibiosemu.o
diff --git a/common/Makefile b/common/Makefile
index 483eb4d..3d62775 100644
--- a/common/Makefile
+++ b/common/Makefile
@@ -162,6 +162,7 @@ endif
 COBJS-$(CONFIG_CMD_XIMG) += cmd_ximg.o
 COBJS-$(CONFIG_YAFFS2) += cmd_yaffs2.o
 COBJS-$(CONFIG_CMD_SPL) += cmd_spl.o
+COBJS-$(CONFIG_CMD_ZFS) += cmd_zfs.o
 
 # others
 ifdef CONFIG_DDR_SPD
diff --git a/common/cmd_zfs.c b/common/cmd_zfs.c
new file mode 100644
index 0000000..a6ea2c0
--- /dev/null
+++ b/common/cmd_zfs.c
@@ -0,0 +1,236 @@
+/*
+ *
+ * ZFS filesystem porting to Uboot by
+ * Jorgen Lundman <lundman at lundman.net>
+ *
+ * zfsfs support
+ * made from existing GRUB Sources by Sun, GNU and others.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of
+ * the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston,
+ * MA 02111-1307 USA
+ *
+ */
+
+#include <common.h>
+#include <part.h>
+#include <config.h>
+#include <command.h>
+#include <image.h>
+#include <linux/ctype.h>
+#include <asm/byteorder.h>
+#include <zfs_common.h>
+#include <linux/stat.h>
+#include <malloc.h>
+
+#if defined(CONFIG_CMD_USB) && defined(CONFIG_USB_STORAGE)
+#include <usb.h>
+#endif
+
+#if !defined(CONFIG_DOS_PARTITION) && !defined(CONFIG_EFI_PARTITION)
+#error DOS or EFI partition support must be selected
+#endif
+
+#define DOS_PART_MAGIC_OFFSET 0x1fe
+#define DOS_FS_TYPE_OFFSET 0x36
+#define DOS_FS32_TYPE_OFFSET 0x52
+
+static int do_zfs_load(cmd_tbl_t *cmdtp, int flag, int argc, char *argv[])
+{
+	char *filename = NULL;
+	char *ep;
+	int dev;
+	unsigned long part = 1;
+	ulong addr = 0;
+	ulong part_length;
+	disk_partition_t info;
+	char buf[12];
+	unsigned long count;
+	const char *addr_str;
+	struct zfs_file zfile;
+	struct device_s vdev;
+
+	if (argc < 3)
+		return CMD_RET_USAGE;
+
+	count = 0;
+	addr = (argc > 3) ? simple_strtoul(argv[3], NULL, 16) : 0;
+	filename = getenv("bootfile");
+	switch (argc) {
+	case 3:
+		addr_str = getenv("loadaddr");
+		if (addr_str != NULL)
+			addr = simple_strtoul(addr_str, NULL, 16);
+		else
+			addr = CONFIG_SYS_LOAD_ADDR;
+
+		break;
+	case 4:
+		break;
+	case 5:
+		filename = argv[4];
+		break;
+	case 6:
+		filename = argv[4];
+		count = simple_strtoul(argv[5], NULL, 16);
+		break;
+
+	default:
+		return cmd_usage(cmdtp);
+	}
+
+	if (!filename) {
+		puts("** No boot file defined **\n");
+		return 1;
+	}
+
+	dev = (int)simple_strtoul(argv[2], &ep, 16);
+	zfs_dev_desc = get_dev(argv[1], dev);
+	if (zfs_dev_desc == NULL) {
+		printf("** Block device %s %d not supported\n", argv[1], dev);
+		return 1;
+	}
+
+	if (*ep) {
+		if (*ep != ':') {
+			puts("** Invalid boot device, use `dev[:part]' **\n");
+			return 1;
+		}
+		part = simple_strtoul(++ep, NULL, 16);
+	}
+
+	if (part != 0) {
+		if (get_partition_info(zfs_dev_desc, part, &info)) {
+			printf("** Bad partition %lu **\n", part);
+			return 1;
+		}
+
+		if (strncmp((char *)info.type, BOOT_PART_TYPE,
+			    strlen(BOOT_PART_TYPE)) != 0) {
+			printf("** Invalid partition type \"%s\" (expect \"" BOOT_PART_TYPE "\")\n",
+			       info.type);
+			return 1;
+		}
+		printf("Loading file \"%s\" "
+		       "from %s device %d:%lu %s\n",
+		       filename, argv[1], dev, part, info.name);
+	} else {
+		printf("Loading file \"%s\" from %s device %d\n",
+		       filename, argv[1], dev);
+	}
+
+	part_length = zfs_set_blk_dev(zfs_dev_desc, part);
+	if (part_length == 0) {
+		printf("**Bad partition - %s %d:%lu **\n", argv[1], dev, part);
+		return 1;
+	}
+
+	vdev.part_length = part_length;
+
+	memset(&zfile, 0, sizeof(zfile));
+	zfile.device = &vdev;
+	if (zfs_open(&zfile, filename)) {
+		printf("** File not found %s\n", filename);
+		return 1;
+	}
+
+	if ((count < zfile.size) && (count != 0))
+		zfile.size = (uint64_t)count;
+
+	if (zfs_read(&zfile, (char *)addr, zfile.size) != zfile.size) {
+		printf("** Unable to read \"%s\" from %s %d:%lu **\n",
+		       filename, argv[1], dev, part);
+		zfs_close(&zfile);
+		return 1;
+	}
+
+	zfs_close(&zfile);
+
+	/* Loading ok, update default load address */
+	load_addr = addr;
+
+	printf("%llu bytes read\n", zfile.size);
+	sprintf(buf, "%llX", zfile.size);
+	setenv("filesize", buf);
+
+	return 0;
+}
+
+
+int zfs_print(const char *entry, const struct zfs_dirhook_info *data)
+{
+	printf("%s %s\n",
+	       data->dir ? "<DIR> " : " ",
+	       entry);
+	return 0; /* 0 continue, 1 stop */
+}
+
+
+
+static int do_zfs_ls(cmd_tbl_t *cmdtp, int flag, int argc, char *argv[])
+{
+	const char *filename = "/";
+	int dev;
+	unsigned long part = 1;
+	char *ep;
+	int part_length;
+	struct device_s vdev;
+
+	if (argc < 3)
+		return cmd_usage(cmdtp);
+
+	dev = (int)simple_strtoul(argv[2], &ep, 16);
+	zfs_dev_desc = get_dev(argv[1], dev);
+
+	if (zfs_dev_desc == NULL) {
+		printf("\n** Block device %s %d not supported\n", argv[1], dev);
+		return 1;
+	}
+
+	if (*ep) {
+		if (*ep != ':') {
+			puts("\n** Invalid boot device, use `dev[:part]' **\n");
+			return 1;
+		}
+		part = simple_strtoul(++ep, NULL, 16);
+	}
+
+	if (argc == 4)
+		filename = argv[3];
+
+	part_length = zfs_set_blk_dev(zfs_dev_desc, part);
+	if (part_length == 0) {
+		printf("** Bad partition - %s %d:%lu **\n", argv[1], dev, part);
+		return 1;
+	}
+
+	vdev.part_length = part_length;
+
+	zfs_ls(&vdev, filename,
+	       zfs_print);
+
+	return 0;
+}
+
+
+U_BOOT_CMD(zfsls, 4, 1, do_zfs_ls,
+	   "list files in a directory (default /)",
+	   "<interface> <dev[:part]> [directory]\n"
+	   " - list files from 'dev' on 'interface' in a '/DATASET/@/$dir/'");
+
+U_BOOT_CMD(zfsload, 6, 0, do_zfs_load,
+	   "load binary file from a ZFS filesystem",
+	   "<interface> <dev[:part]> [addr] [filename] [bytes]\n"
+	   " - load binary file '/DATASET/@/$dir/$file' from 'dev' on 'interface'\n"
+	   " to address 'addr' from ZFS filesystem");
diff --git a/doc/README.zfs b/doc/README.zfs
new file mode 100644
index 0000000..4b0e8a5
--- /dev/null
+++ b/doc/README.zfs
@@ -0,0 +1,30 @@
+This patch series adds support for ZFS listing and load to u-boot.
+
+To enable zfs ls and load commands, modify the board specific config file with
+#define CONFIG_CMD_ZFS
+
+Steps to test:
+
+1. After applying the patch, zfs specific commands can be seen
+   in the boot loader prompt using
+	UBOOT #help
+
+	zfsload- load binary file from a ZFS file system
+	zfsls - list files in a directory (default /)
+
+2. To list the files in zfs pool, device or partition, execute
+	zfsls <interface> <dev[:part]> [POOL/@/dir/file]
+	For example:
+	UBOOT #zfsls mmc 0:5 /rpool/@/usr/bin/
+
+3. To read and load a file from a ZFS formatted partition to RAM, execute
+	zfsload <interface> <dev[:part]> [addr] [filename] [bytes]
+	For example:
+	UBOOT #zfsload mmc 2:2 0x30007fc0 /rpool/@/boot/uImage
+
+References :
+	-- ZFS GRUB sources from Solaris GRUB-0.97
+	-- GRUB Bazaar repository
+
+Jorgen Lundman <lundman at lundman.net> 2012.
+
diff --git a/fs/Makefile b/fs/Makefile
index 22aad12..28da76e 100644
--- a/fs/Makefile
+++ b/fs/Makefile
@@ -30,6 +30,7 @@ subdirs-$(CONFIG_CMD_JFFS2) += jffs2
 subdirs-$(CONFIG_CMD_REISER) += reiserfs
 subdirs-$(CONFIG_YAFFS2) += yaffs2
 subdirs-$(CONFIG_CMD_UBIFS) += ubifs
+subdirs-$(CONFIG_CMD_ZFS) += zfs
 
 SUBDIRS := $(subdirs-y)
diff --git a/fs/Makefile b/fs/zfs/Makefile similarity index 56% copy from fs/Makefile copy to fs/zfs/Makefile index 22aad12..938fc5e 100644 --- a/fs/Makefile +++ b/fs/zfs/Makefile @@ -1,6 +1,6 @@ # -# (C) Copyright 2000-2006 -# Wolfgang Denk, DENX Software Engineering, wd@denx.de. +# (C) Copyright 2012 +# Jorgen Lundman <lundman at lundman.net> # # See file CREDITS for list of people who contributed to this # project. @@ -20,19 +20,28 @@ # Foundation, Inc., 59 Temple Place, Suite 330, Boston, # MA 02111-1307 USA # -#
-subdirs-$(CONFIG_CMD_CRAMFS) := cramfs -subdirs-$(CONFIG_CMD_EXT2) += ext2 -subdirs-$(CONFIG_CMD_FAT) += fat -subdirs-$(CONFIG_CMD_FDOS) += fdos -subdirs-$(CONFIG_CMD_JFFS2) += jffs2 -subdirs-$(CONFIG_CMD_REISER) += reiserfs -subdirs-$(CONFIG_YAFFS2) += yaffs2 -subdirs-$(CONFIG_CMD_UBIFS) += ubifs +include $(TOPDIR)/config.mk + +LIB = $(obj)libzfs.o + +AOBJS = +COBJS-$(CONFIG_CMD_ZFS) := dev.o zfs.o zfs_fletcher.o zfs_sha256.o zfs_lzjb.o + +SRCS := $(AOBJS:.o=.S) $(COBJS-y:.o=.c) +OBJS := $(addprefix $(obj),$(AOBJS) $(COBJS-y)) + + +all: $(LIB) $(AOBJS) + +$(LIB): $(obj).depend $(OBJS) + $(call cmd_link_o_target, $(OBJS)) + +######################################################################### + +# defines $(obj).depend target +include $(SRCTREE)/rules.mk
-SUBDIRS := $(subdirs-y) +sinclude $(obj).depend
-$(obj).depend all: - @for dir in $(SUBDIRS) ; do \ - $(MAKE) -C $$dir $@ ; done +######################################################################### diff --git a/fs/zfs/dev.c b/fs/zfs/dev.c new file mode 100644 index 0000000..d68372c --- /dev/null +++ b/fs/zfs/dev.c @@ -0,0 +1,137 @@ +/* + * + * based on code of fs/reiserfs/dev.c by + * + * (C) Copyright 2003 - 2004 + * Sysgo AG, <www.elinos.com>, Pavel Bartusek pba@sysgo.com + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. 
+ */ + + +#include <common.h> +#include <config.h> +#include <zfs_common.h> + +static block_dev_desc_t *zfs_block_dev_desc; +static disk_partition_t part_info; + +int zfs_set_blk_dev(block_dev_desc_t *rbdd, int part) +{ + zfs_block_dev_desc = rbdd; + + if (part == 0) { + /* disk doesn't use partition table */ + part_info.start = 0; + part_info.size = rbdd->lba; + part_info.blksz = rbdd->blksz; + } else { + if (get_partition_info(zfs_block_dev_desc, part, &part_info)) + return 0; + } + + return part_info.size; +} + +/* err */ +int zfs_devread(int sector, int byte_offset, int byte_len, char *buf) +{ + short sec_buffer[SECTOR_SIZE/sizeof(short)]; + char *sec_buf = (char *)sec_buffer; + unsigned block_len; + + /* + * Check partition boundaries + */ + if ((sector < 0) || + ((sector + ((byte_offset + byte_len - 1) >> SECTOR_BITS)) >= + part_info.size)) { + /* errnum = ERR_OUTSIDE_PART; */ + printf(" ** zfs_devread() read outside partition sector %d\n", sector); + return 1; + } + + /* + * Get the read to the beginning of a partition. 
+ */ + sector += byte_offset >> SECTOR_BITS; + byte_offset &= SECTOR_SIZE - 1; + + debug(" <%d, %d, %d>\n", sector, byte_offset, byte_len); + + if (zfs_block_dev_desc == NULL) { + printf("** Invalid Block Device Descriptor (NULL)\n"); + return 1; + } + + if (byte_offset != 0) { + /* read first part which isn't aligned with start of sector */ + if (zfs_block_dev_desc->block_read(zfs_block_dev_desc->dev, + part_info.start + sector, 1, + (unsigned long *) sec_buf) != 1) { + printf(" ** zfs_devread() read error **\n"); + return 1; + } + memcpy(buf, sec_buf + byte_offset, + min(SECTOR_SIZE - byte_offset, byte_len)); + buf += min(SECTOR_SIZE - byte_offset, byte_len); + byte_len -= min(SECTOR_SIZE - byte_offset, byte_len); + sector++; + } + + if (byte_len == 0) + return 0; + + /* read sector aligned part */ + block_len = byte_len & ~(SECTOR_SIZE - 1); + + if (block_len == 0) { + u8 p[SECTOR_SIZE]; + + block_len = SECTOR_SIZE; + zfs_block_dev_desc->block_read(zfs_block_dev_desc->dev, + part_info.start + sector, + 1, (unsigned long *)p); + memcpy(buf, p, byte_len); + return 0; + } + + if (zfs_block_dev_desc->block_read(zfs_block_dev_desc->dev, + part_info.start + sector, + block_len / SECTOR_SIZE, + (unsigned long *) buf) != + block_len / SECTOR_SIZE) { + printf(" ** zfs_devread() read error - block\n"); + return 1; + } + + block_len = byte_len & ~(SECTOR_SIZE - 1); + buf += block_len; + byte_len -= block_len; + sector += block_len / SECTOR_SIZE; + + if (byte_len != 0) { + /* read rest of data which are not in whole sector */ + if (zfs_block_dev_desc-> + block_read(zfs_block_dev_desc->dev, + part_info.start + sector, 1, + (unsigned long *) sec_buf) != 1) { + printf(" ** zfs_devread() read error - last part\n"); + return 1; + } + memcpy(buf, sec_buf, byte_len); + } + return 0; +} diff --git a/fs/zfs/zfs.c b/fs/zfs/zfs.c new file mode 100644 index 0000000..cdf8950 --- /dev/null +++ b/fs/zfs/zfs.c @@ -0,0 +1,2396 @@ +/* + * + * ZFS filesystem ported to u-boot by + * Jorgen 
Lundman <lundman at lundman.net> + * + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 + * Free Software Foundation, Inc. + * Copyright 2004 Sun Microsystems, Inc. + * + * GRUB is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * GRUB is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with GRUB. If not, see http://www.gnu.org/licenses/. + * + */ + +#include <common.h> +#include <malloc.h> +#include <linux/stat.h> +#include <linux/time.h> +#include <linux/ctype.h> +#include <asm/byteorder.h> +#include "zfs_common.h" + +block_dev_desc_t *zfs_dev_desc; + +/* + * The zfs plug-in routines for GRUB are: + * + * zfs_mount() - locates a valid uberblock of the root pool and reads + * in its MOS at the memory address MOS. + * + * zfs_open() - locates a plain file object by following the MOS + * and places its dnode at the memory address DNODE. + * + * zfs_read() - read in the data blocks pointed by the DNODE. + * + */ + +#include <zfs/zfs.h> +#include <zfs/zio.h> +#include <zfs/dnode.h> +#include <zfs/uberblock_impl.h> +#include <zfs/vdev_impl.h> +#include <zfs/zio_checksum.h> +#include <zfs/zap_impl.h> +#include <zfs/zap_leaf.h> +#include <zfs/zfs_znode.h> +#include <zfs/dmu.h> +#include <zfs/dmu_objset.h> +#include <zfs/sa_impl.h> +#include <zfs/dsl_dir.h> +#include <zfs/dsl_dataset.h> + + +#define ZPOOL_PROP_BOOTFS "bootfs" + + +/* + * For nvlist manipulation. 
(from nvpair.h) + */ +#define NV_ENCODE_NATIVE 0 +#define NV_ENCODE_XDR 1 +#define NV_BIG_ENDIAN 0 +#define NV_LITTLE_ENDIAN 1 +#define DATA_TYPE_UINT64 8 +#define DATA_TYPE_STRING 9 +#define DATA_TYPE_NVLIST 19 +#define DATA_TYPE_NVLIST_ARRAY 20 + + +/* + * Macros to get fields in a bp or DVA. + */ +#define P2PHASE(x, align) ((x) & ((align) - 1)) +#define DVA_OFFSET_TO_PHYS_SECTOR(offset) \ + ((offset + VDEV_LABEL_START_SIZE) >> SPA_MINBLOCKSHIFT) + +/* + * return x rounded down to an align boundary + * eg, P2ALIGN(1200, 1024) == 1024 (1*align) + * eg, P2ALIGN(1024, 1024) == 1024 (1*align) + * eg, P2ALIGN(0x1234, 0x100) == 0x1200 (0x12*align) + * eg, P2ALIGN(0x5600, 0x100) == 0x5600 (0x56*align) + */ +#define P2ALIGN(x, align) ((x) & -(align)) + +/* + * FAT ZAP data structures + */ +#define ZFS_CRC64_POLY 0xC96C5795D7870F42ULL /* ECMA-182, reflected form */ +#define ZAP_HASH_IDX(hash, n) (((n) == 0) ? 0 : ((hash) >> (64 - (n)))) +#define CHAIN_END 0xffff /* end of the chunk chain */ + +/* + * The amount of space within the chunk available for the array is: + * chunk size - space for type (1) - space for next pointer (2) + */ +#define ZAP_LEAF_ARRAY_BYTES (ZAP_LEAF_CHUNKSIZE - 3) + +#define ZAP_LEAF_HASH_SHIFT(bs) (bs - 5) +#define ZAP_LEAF_HASH_NUMENTRIES(bs) (1 << ZAP_LEAF_HASH_SHIFT(bs)) +#define LEAF_HASH(bs, h) \ + ((ZAP_LEAF_HASH_NUMENTRIES(bs)-1) & \ + ((h) >> (64 - ZAP_LEAF_HASH_SHIFT(bs)-l->l_hdr.lh_prefix_len))) + +/* + * The amount of space available for chunks is: + * block size shift - hash entry size (2) * number of hash + * entries - header space (2*chunksize) + */ +#define ZAP_LEAF_NUMCHUNKS(bs) \ + (((1<<bs) - 2*ZAP_LEAF_HASH_NUMENTRIES(bs)) / \ + ZAP_LEAF_CHUNKSIZE - 2) + +/* + * The chunks start immediately after the hash table. The end of the + * hash table is at l_hash + HASH_NUMENTRIES, which we simply cast to a + * chunk_t. 
+ */ +#define ZAP_LEAF_CHUNK(l, bs, idx) \ + ((zap_leaf_chunk_t *)(l->l_hash + ZAP_LEAF_HASH_NUMENTRIES(bs)))[idx] +#define ZAP_LEAF_ENTRY(l, bs, idx) (&ZAP_LEAF_CHUNK(l, bs, idx).l_entry) + + +/* + * Decompression Entry - lzjb + */ +#ifndef NBBY +#define NBBY 8 +#endif + + + +typedef int zfs_decomp_func_t(void *s_start, void *d_start, + uint32_t s_len, uint32_t d_len); +typedef struct decomp_entry { + char *name; + zfs_decomp_func_t *decomp_func; +} decomp_entry_t; + +typedef struct dnode_end { + dnode_phys_t dn; + zfs_endian_t endian; +} dnode_end_t; + +struct zfs_data { + /* cache for a file block of the currently zfs_open()-ed file */ + char *file_buf; + uint64_t file_start; + uint64_t file_end; + + /* XXX: ashift is per vdev, not per pool. We currently only ever touch + * a single vdev, but when/if raid-z or stripes are supported, this + * may need revision. + */ + uint64_t vdev_ashift; + uint64_t label_txg; + uint64_t pool_guid; + + /* cache for a dnode block */ + dnode_phys_t *dnode_buf; + dnode_phys_t *dnode_mdn; + uint64_t dnode_start; + uint64_t dnode_end; + zfs_endian_t dnode_endian; + + uberblock_t current_uberblock; + + dnode_end_t mos; + dnode_end_t mdn; + dnode_end_t dnode; + + uint64_t vdev_phys_sector; + + int (*userhook)(const char *, const struct zfs_dirhook_info *); + struct zfs_dirhook_info *dirinfo; + +}; + + + + +static int +zlib_decompress(void *s, void *d, + uint32_t slen, uint32_t dlen) +{ + if (zlib_decompress(s, d, slen, dlen) < 0) + return ZFS_ERR_BAD_FS; + return ZFS_ERR_NONE; +} + +static decomp_entry_t decomp_table[ZIO_COMPRESS_FUNCTIONS] = { + {"inherit", NULL}, /* ZIO_COMPRESS_INHERIT */ + {"on", lzjb_decompress}, /* ZIO_COMPRESS_ON */ + {"off", NULL}, /* ZIO_COMPRESS_OFF */ + {"lzjb", lzjb_decompress}, /* ZIO_COMPRESS_LZJB */ + {"empty", NULL}, /* ZIO_COMPRESS_EMPTY */ + {"gzip-1", zlib_decompress}, /* ZIO_COMPRESS_GZIP1 */ + {"gzip-2", zlib_decompress}, /* ZIO_COMPRESS_GZIP2 */ + {"gzip-3", zlib_decompress}, /* ZIO_COMPRESS_GZIP3 
*/ + {"gzip-4", zlib_decompress}, /* ZIO_COMPRESS_GZIP4 */ + {"gzip-5", zlib_decompress}, /* ZIO_COMPRESS_GZIP5 */ + {"gzip-6", zlib_decompress}, /* ZIO_COMPRESS_GZIP6 */ + {"gzip-7", zlib_decompress}, /* ZIO_COMPRESS_GZIP7 */ + {"gzip-8", zlib_decompress}, /* ZIO_COMPRESS_GZIP8 */ + {"gzip-9", zlib_decompress}, /* ZIO_COMPRESS_GZIP9 */ +}; + + + +static int zio_read_data(blkptr_t *bp, zfs_endian_t endian, + void *buf, struct zfs_data *data); + +static int +zio_read(blkptr_t *bp, zfs_endian_t endian, void **buf, + size_t *size, struct zfs_data *data); + +/* + * Our own version of log2(). Same thing as highbit()-1. + */ +static int +zfs_log2(uint64_t num) +{ + int i = 0; + + while (num > 1) { + i++; + num = num >> 1; + } + + return i; +} + + +/* Checksum Functions */ +static void +zio_checksum_off(const void *buf __attribute__ ((unused)), + uint64_t size __attribute__ ((unused)), + zfs_endian_t endian __attribute__ ((unused)), + zio_cksum_t *zcp) +{ + ZIO_SET_CHECKSUM(zcp, 0, 0, 0, 0); +} + +/* Checksum Table and Values */ +static zio_checksum_info_t zio_checksum_table[ZIO_CHECKSUM_FUNCTIONS] = { + {NULL, 0, 0, "inherit"}, + {NULL, 0, 0, "on"}, + {zio_checksum_off, 0, 0, "off"}, + {zio_checksum_SHA256, 1, 1, "label"}, + {zio_checksum_SHA256, 1, 1, "gang_header"}, + {NULL, 0, 0, "zilog"}, + {fletcher_2_endian, 0, 0, "fletcher2"}, + {fletcher_4_endian, 1, 0, "fletcher4"}, + {zio_checksum_SHA256, 1, 0, "SHA256"}, + {NULL, 0, 0, "zilog2"}, +}; + +/* + * zio_checksum_verify: Provides support for checksum verification. + * + * Fletcher2, Fletcher4, and SHA256 are supported. 
+ * + */ +static int +zio_checksum_verify(zio_cksum_t zc, uint32_t checksum, + zfs_endian_t endian, char *buf, int size) +{ + zio_eck_t *zec = (zio_eck_t *) (buf + size) - 1; + zio_checksum_info_t *ci = &zio_checksum_table[checksum]; + zio_cksum_t actual_cksum, expected_cksum; + + if (checksum >= ZIO_CHECKSUM_FUNCTIONS || ci->ci_func == NULL) { + printf("zfs unknown checksum function %d\n", checksum); + return ZFS_ERR_NOT_IMPLEMENTED_YET; + } + + if (ci->ci_eck) { + expected_cksum = zec->zec_cksum; + zec->zec_cksum = zc; + ci->ci_func(buf, size, endian, &actual_cksum); + zec->zec_cksum = expected_cksum; + zc = expected_cksum; + } else { + ci->ci_func(buf, size, endian, &actual_cksum); + } + + if ((actual_cksum.zc_word[0] != zc.zc_word[0]) + || (actual_cksum.zc_word[1] != zc.zc_word[1]) + || (actual_cksum.zc_word[2] != zc.zc_word[2]) + || (actual_cksum.zc_word[3] != zc.zc_word[3])) { + return ZFS_ERR_BAD_FS; + } + + return ZFS_ERR_NONE; +} + +/* + * vdev_uberblock_compare takes two uberblock structures and returns an integer + * indicating the more recent of the two. + * Return Value = 1 if ub1 is more recent + * Return Value = -1 if ub2 is more recent + * Return Value = 0 if neither is more recent + * The most recent uberblock is determined using its transaction number and + * timestamp. The uberblock with the highest transaction number is + * considered "newer". If the transaction numbers of the two blocks match, the + * timestamps are compared to determine the "newer" of the two.
+ */ +static int +vdev_uberblock_compare(uberblock_t *ub1, uberblock_t *ub2) +{ + zfs_endian_t ub1_endian, ub2_endian; + if (zfs_to_cpu64(ub1->ub_magic, LITTLE_ENDIAN) == UBERBLOCK_MAGIC) + ub1_endian = LITTLE_ENDIAN; + else + ub1_endian = BIG_ENDIAN; + if (zfs_to_cpu64(ub2->ub_magic, LITTLE_ENDIAN) == UBERBLOCK_MAGIC) + ub2_endian = LITTLE_ENDIAN; + else + ub2_endian = BIG_ENDIAN; + + if (zfs_to_cpu64(ub1->ub_txg, ub1_endian) + < zfs_to_cpu64(ub2->ub_txg, ub2_endian)) + return -1; + if (zfs_to_cpu64(ub1->ub_txg, ub1_endian) + > zfs_to_cpu64(ub2->ub_txg, ub2_endian)) + return 1; + + if (zfs_to_cpu64(ub1->ub_timestamp, ub1_endian) + < zfs_to_cpu64(ub2->ub_timestamp, ub2_endian)) + return -1; + if (zfs_to_cpu64(ub1->ub_timestamp, ub1_endian) + > zfs_to_cpu64(ub2->ub_timestamp, ub2_endian)) + return 1; + + return 0; +} + +/* + * Three pieces of information are needed to verify an uberblock: the magic + * number, the version number, and the checksum. + * + * Currently Implemented: version number, magic number, label txg + * Need to Implement: checksum + * + */ +static int +uberblock_verify(uberblock_t *uber, int offset, struct zfs_data *data) +{ + int err; + zfs_endian_t endian = UNKNOWN_ENDIAN; + zio_cksum_t zc; + + if (uber->ub_txg < data->label_txg) { + debug("ignoring partially written label: uber_txg < label_txg %llu %llu\n", + uber->ub_txg, data->label_txg); + return ZFS_ERR_BAD_FS; + } + + if (zfs_to_cpu64(uber->ub_magic, LITTLE_ENDIAN) == UBERBLOCK_MAGIC + && zfs_to_cpu64(uber->ub_version, LITTLE_ENDIAN) > 0 + && zfs_to_cpu64(uber->ub_version, LITTLE_ENDIAN) <= SPA_VERSION) + endian = LITTLE_ENDIAN; + + if (zfs_to_cpu64(uber->ub_magic, BIG_ENDIAN) == UBERBLOCK_MAGIC + && zfs_to_cpu64(uber->ub_version, BIG_ENDIAN) > 0 + && zfs_to_cpu64(uber->ub_version, BIG_ENDIAN) <= SPA_VERSION) + endian = BIG_ENDIAN; + + if (endian == UNKNOWN_ENDIAN) { + printf("invalid uberblock magic\n"); + return ZFS_ERR_BAD_FS; + } + + memset(&zc, 0, sizeof(zc)); + zc.zc_word[0] = 
cpu_to_zfs64(offset, endian); + err = zio_checksum_verify(zc, ZIO_CHECKSUM_LABEL, endian, + (char *) uber, UBERBLOCK_SIZE(data->vdev_ashift)); + + if (!err) { + /* Check that the data pointed by the rootbp is usable. */ + void *osp = NULL; + size_t ospsize; + err = zio_read(&uber->ub_rootbp, endian, &osp, &ospsize, data); + free(osp); + + if (!err && ospsize < OBJSET_PHYS_SIZE_V14) { + printf("uberblock rootbp points to invalid data\n"); + return ZFS_ERR_BAD_FS; + } + } + + return err; +} + +/* + * Find the best uberblock. + * Return: + * Success - Pointer to the best uberblock. + * Failure - NULL + */ +static uberblock_t *find_bestub(char *ub_array, struct zfs_data *data) +{ + const uint64_t sector = data->vdev_phys_sector; + uberblock_t *ubbest = NULL; + uberblock_t *ubnext; + unsigned int i, offset, pickedub = 0; + int err = ZFS_ERR_NONE; + + const unsigned int UBCOUNT = UBERBLOCK_COUNT(data->vdev_ashift); + const uint64_t UBBYTES = UBERBLOCK_SIZE(data->vdev_ashift); + + for (i = 0; i < UBCOUNT; i++) { + ubnext = (uberblock_t *) (i * UBBYTES + ub_array); + offset = (sector << SPA_MINBLOCKSHIFT) + VDEV_PHYS_SIZE + (i * UBBYTES); + + err = uberblock_verify(ubnext, offset, data); + if (err) + continue; + + if (ubbest == NULL || vdev_uberblock_compare(ubnext, ubbest) > 0) { + ubbest = ubnext; + pickedub = i; + } + } + + if (ubbest) + debug("zfs Found best uberblock at idx %d, txg %llu\n", + pickedub, (unsigned long long) ubbest->ub_txg); + + return ubbest; +} + +static inline size_t +get_psize(blkptr_t *bp, zfs_endian_t endian) +{ + return (((zfs_to_cpu64((bp)->blk_prop, endian) >> 16) & 0xffff) + 1) + << SPA_MINBLOCKSHIFT; +} + +static uint64_t +dva_get_offset(dva_t *dva, zfs_endian_t endian) +{ + return zfs_to_cpu64((dva)->dva_word[1], + endian) << SPA_MINBLOCKSHIFT; +} + +/* + * Read a block of data based on the gang block address dva, + * and put its data in buf. 
+ * + */ +static int +zio_read_gang(blkptr_t *bp, zfs_endian_t endian, dva_t *dva, void *buf, + struct zfs_data *data) +{ + zio_gbh_phys_t *zio_gb; + uint64_t offset, sector; + unsigned i; + int err; + zio_cksum_t zc; + + memset(&zc, 0, sizeof(zc)); + + zio_gb = malloc(SPA_GANGBLOCKSIZE); + if (!zio_gb) + return ZFS_ERR_OUT_OF_MEMORY; + + offset = dva_get_offset(dva, endian); + sector = DVA_OFFSET_TO_PHYS_SECTOR(offset); + + /* read in the gang block header */ + err = zfs_devread(sector, 0, SPA_GANGBLOCKSIZE, (char *) zio_gb); + + if (err) { + free(zio_gb); + return err; + } + + /* XXX */ + /* self checksuming the gang block header */ + ZIO_SET_CHECKSUM(&zc, DVA_GET_VDEV(dva), + dva_get_offset(dva, endian), bp->blk_birth, 0); + err = zio_checksum_verify(zc, ZIO_CHECKSUM_GANG_HEADER, endian, + (char *) zio_gb, SPA_GANGBLOCKSIZE); + if (err) { + free(zio_gb); + return err; + } + + endian = (zfs_to_cpu64(bp->blk_prop, endian) >> 63) & 1; + + for (i = 0; i < SPA_GBH_NBLKPTRS; i++) { + if (zio_gb->zg_blkptr[i].blk_birth == 0) + continue; + + err = zio_read_data(&zio_gb->zg_blkptr[i], endian, buf, data); + if (err) { + free(zio_gb); + return err; + } + buf = (char *) buf + get_psize(&zio_gb->zg_blkptr[i], endian); + } + free(zio_gb); + return ZFS_ERR_NONE; +} + +/* + * Read in a block of raw data to buf. 
+ */ +static int +zio_read_data(blkptr_t *bp, zfs_endian_t endian, void *buf, + struct zfs_data *data) +{ + int i, psize; + int err = ZFS_ERR_NONE; + + psize = get_psize(bp, endian); + + /* pick a good dva from the block pointer */ + for (i = 0; i < SPA_DVAS_PER_BP; i++) { + uint64_t offset, sector; + + if (bp->blk_dva[i].dva_word[0] == 0 && bp->blk_dva[i].dva_word[1] == 0) + continue; + + if ((zfs_to_cpu64(bp->blk_dva[i].dva_word[1], endian)>>63) & 1) { + err = zio_read_gang(bp, endian, &bp->blk_dva[i], buf, data); + } else { + /* read in a data block */ + offset = dva_get_offset(&bp->blk_dva[i], endian); + sector = DVA_OFFSET_TO_PHYS_SECTOR(offset); + + err = zfs_devread(sector, 0, psize, buf); + } + + if (!err) { + /*Check the underlying checksum before we rule this DVA as "good"*/ + uint32_t checkalgo = (zfs_to_cpu64((bp)->blk_prop, endian) >> 40) & 0xff; + + err = zio_checksum_verify(bp->blk_cksum, checkalgo, endian, buf, psize); + if (!err) + return ZFS_ERR_NONE; + } + + /* If read failed or checksum bad, reset the error. Hopefully we've got some more DVA's to try.*/ + } + + if (!err) { + printf("couldn't find a valid DVA\n"); + err = ZFS_ERR_BAD_FS; + } + + return err; +} + +/* + * Read in a block of data, verify its checksum, decompress if needed, + * and put the uncompressed data in buf. + */ +static int +zio_read(blkptr_t *bp, zfs_endian_t endian, void **buf, + size_t *size, struct zfs_data *data) +{ + size_t lsize, psize; + unsigned int comp; + char *compbuf = NULL; + int err; + + *buf = NULL; + + comp = (zfs_to_cpu64((bp)->blk_prop, endian)>>32) & 0xff; + lsize = (BP_IS_HOLE(bp) ? 
0 : + (((zfs_to_cpu64((bp)->blk_prop, endian) & 0xffff) + 1) + << SPA_MINBLOCKSHIFT)); + psize = get_psize(bp, endian); + + if (size) + *size = lsize; + + if (comp >= ZIO_COMPRESS_FUNCTIONS) { + printf("compression algorithm %u not supported\n", (unsigned int) comp); + return ZFS_ERR_NOT_IMPLEMENTED_YET; + } + + if (comp != ZIO_COMPRESS_OFF && decomp_table[comp].decomp_func == NULL) { + printf("compression algorithm %s not supported\n", decomp_table[comp].name); + return ZFS_ERR_NOT_IMPLEMENTED_YET; + } + + if (comp != ZIO_COMPRESS_OFF) { + compbuf = malloc(psize); + if (!compbuf) + return ZFS_ERR_OUT_OF_MEMORY; + } else { + compbuf = *buf = malloc(lsize); + } + + err = zio_read_data(bp, endian, compbuf, data); + if (err) { + free(compbuf); + *buf = NULL; + return err; + } + + if (comp != ZIO_COMPRESS_OFF) { + *buf = malloc(lsize); + if (!*buf) { + free(compbuf); + return ZFS_ERR_OUT_OF_MEMORY; + } + + err = decomp_table[comp].decomp_func(compbuf, *buf, psize, lsize); + free(compbuf); + if (err) { + free(*buf); + *buf = NULL; + return err; + } + } + + return ZFS_ERR_NONE; +} + +/* + * Get the block from a block id. + * push the block onto the stack. 
+ * + */ +static int +dmu_read(dnode_end_t *dn, uint64_t blkid, void **buf, + zfs_endian_t *endian_out, struct zfs_data *data) +{ + int idx, level; + blkptr_t *bp_array = dn->dn.dn_blkptr; + int epbs = dn->dn.dn_indblkshift - SPA_BLKPTRSHIFT; + blkptr_t *bp; + void *tmpbuf = 0; + zfs_endian_t endian; + int err = ZFS_ERR_NONE; + + bp = malloc(sizeof(blkptr_t)); + if (!bp) + return ZFS_ERR_OUT_OF_MEMORY; + + endian = dn->endian; + for (level = dn->dn.dn_nlevels - 1; level >= 0; level--) { + idx = (blkid >> (epbs * level)) & ((1 << epbs) - 1); + *bp = bp_array[idx]; + if (bp_array != dn->dn.dn_blkptr) { + free(bp_array); + bp_array = 0; + } + + if (BP_IS_HOLE(bp)) { + size_t size = zfs_to_cpu16(dn->dn.dn_datablkszsec, + dn->endian) + << SPA_MINBLOCKSHIFT; + *buf = malloc(size); + if (!*buf) { + err = ZFS_ERR_OUT_OF_MEMORY; + break; + } + memset(*buf, 0, size); + endian = (zfs_to_cpu64(bp->blk_prop, endian) >> 63) & 1; + break; + } + if (level == 0) { + err = zio_read(bp, endian, buf, 0, data); + endian = (zfs_to_cpu64(bp->blk_prop, endian) >> 63) & 1; + break; + } + err = zio_read(bp, endian, &tmpbuf, 0, data); + endian = (zfs_to_cpu64(bp->blk_prop, endian) >> 63) & 1; + if (err) + break; + bp_array = tmpbuf; + } + if (bp_array != dn->dn.dn_blkptr) + free(bp_array); + if (endian_out) + *endian_out = endian; + + free(bp); + return err; +} + +/* + * mzap_lookup: Looks up property described by "name" and returns the value + * in "value".
+ */ +static int +mzap_lookup(mzap_phys_t *zapobj, zfs_endian_t endian, + int objsize, char *name, uint64_t *value) +{ + int i, chunks; + mzap_ent_phys_t *mzap_ent = zapobj->mz_chunk; + + chunks = objsize / MZAP_ENT_LEN - 1; + for (i = 0; i < chunks; i++) { + if (strcmp(mzap_ent[i].mze_name, name) == 0) { + *value = zfs_to_cpu64(mzap_ent[i].mze_value, endian); + return ZFS_ERR_NONE; + } + } + + printf("couldn't find '%s'\n", name); + return ZFS_ERR_FILE_NOT_FOUND; +} + +static int +mzap_iterate(mzap_phys_t *zapobj, zfs_endian_t endian, int objsize, + int (*hook)(const char *name, + uint64_t val, + struct zfs_data *data), + struct zfs_data *data) +{ + int i, chunks; + mzap_ent_phys_t *mzap_ent = zapobj->mz_chunk; + + chunks = objsize / MZAP_ENT_LEN - 1; + for (i = 0; i < chunks; i++) { + if (hook(mzap_ent[i].mze_name, + zfs_to_cpu64(mzap_ent[i].mze_value, endian), + data)) + return 1; + } + + return 0; +} + +static uint64_t +zap_hash(uint64_t salt, const char *name) +{ + static uint64_t table[256]; + const uint8_t *cp; + uint8_t c; + uint64_t crc = salt; + + if (table[128] == 0) { + uint64_t *ct; + int i, j; + for (i = 0; i < 256; i++) { + for (ct = table + i, *ct = i, j = 8; j > 0; j--) + *ct = (*ct >> 1) ^ (-(*ct & 1) & ZFS_CRC64_POLY); + } + } + + for (cp = (const uint8_t *) name; (c = *cp) != '\0'; cp++) + crc = (crc >> 8) ^ table[(crc ^ c) & 0xFF]; + + /* + * Only use 28 bits, since we need 4 bits in the cookie for the + * collision differentiator. We MUST use the high bits, since + * those are the ones that we first pay attention to when + * choosing the bucket. + */ + crc &= ~((1ULL << (64 - ZAP_HASHBITS)) - 1); + + return crc; +} + +/* + * Only to be used on 8-bit arrays. + * array_len is actual len in bytes (not encoded le_value_length). + * buf is null-terminated.
+ */ +/* XXX */ +static int +zap_leaf_array_equal(zap_leaf_phys_t *l, zfs_endian_t endian, + int blksft, int chunk, int array_len, const char *buf) +{ + int bseen = 0; + + while (bseen < array_len) { + struct zap_leaf_array *la = &ZAP_LEAF_CHUNK(l, blksft, chunk).l_array; + int toread = MIN(array_len - bseen, ZAP_LEAF_ARRAY_BYTES); + + if (chunk >= ZAP_LEAF_NUMCHUNKS(blksft)) + return 0; + + if (memcmp(la->la_array, buf + bseen, toread) != 0) + break; + chunk = zfs_to_cpu16(la->la_next, endian); + bseen += toread; + } + return (bseen == array_len); +} + +/* XXX */ +static int +zap_leaf_array_get(zap_leaf_phys_t *l, zfs_endian_t endian, int blksft, + int chunk, int array_len, char *buf) +{ + int bseen = 0; + + while (bseen < array_len) { + struct zap_leaf_array *la = &ZAP_LEAF_CHUNK(l, blksft, chunk).l_array; + int toread = MIN(array_len - bseen, ZAP_LEAF_ARRAY_BYTES); + + if (chunk >= ZAP_LEAF_NUMCHUNKS(blksft)) + /* Don't use errno because this error is to be ignored. */ + return ZFS_ERR_BAD_FS; + + memcpy(buf + bseen, la->la_array, toread); + chunk = zfs_to_cpu16(la->la_next, endian); + bseen += toread; + } + return ZFS_ERR_NONE; +} + + +/* + * Given a zap_leaf_phys_t, walk thru the zap leaf chunks to get the + * value for the property "name". 
+ * + */ +/* XXX */ +static int +zap_leaf_lookup(zap_leaf_phys_t *l, zfs_endian_t endian, + int blksft, uint64_t h, + const char *name, uint64_t *value) +{ + uint16_t chunk; + struct zap_leaf_entry *le; + + /* Verify if this is a valid leaf block */ + if (zfs_to_cpu64(l->l_hdr.lh_block_type, endian) != ZBT_LEAF) { + printf("invalid leaf type\n"); + return ZFS_ERR_BAD_FS; + } + if (zfs_to_cpu32(l->l_hdr.lh_magic, endian) != ZAP_LEAF_MAGIC) { + printf("invalid leaf magic\n"); + return ZFS_ERR_BAD_FS; + } + + for (chunk = zfs_to_cpu16(l->l_hash[LEAF_HASH(blksft, h)], endian); + chunk != CHAIN_END; chunk = le->le_next) { + + if (chunk >= ZAP_LEAF_NUMCHUNKS(blksft)) { + printf("invalid chunk number\n"); + return ZFS_ERR_BAD_FS; + } + + le = ZAP_LEAF_ENTRY(l, blksft, chunk); + + /* Verify the chunk entry */ + if (le->le_type != ZAP_CHUNK_ENTRY) { + printf("invalid chunk entry\n"); + return ZFS_ERR_BAD_FS; + } + + if (zfs_to_cpu64(le->le_hash, endian) != h) + continue; + + if (zap_leaf_array_equal(l, endian, blksft, + zfs_to_cpu16(le->le_name_chunk, endian), + zfs_to_cpu16(le->le_name_length, endian), + name)) { + struct zap_leaf_array *la; + + if (le->le_int_size != 8 || le->le_value_length != 1) { + printf("invalid leaf chunk entry\n"); + return ZFS_ERR_BAD_FS; + } + /* get the uint64_t property value */ + la = &ZAP_LEAF_CHUNK(l, blksft, le->le_value_chunk).l_array; + + *value = be64_to_cpu(la->la_array64); + + return ZFS_ERR_NONE; + } + } + + printf("couldn't find '%s'\n", name); + return ZFS_ERR_FILE_NOT_FOUND; +} + + +/* Verify if this is a fat zap header block */ +static int +zap_verify(zap_phys_t *zap) +{ + if (zap->zap_magic != (uint64_t) ZAP_MAGIC) { + printf("bad ZAP magic\n"); + return ZFS_ERR_BAD_FS; + } + + if (zap->zap_flags != 0) { + printf("bad ZAP flags\n"); + return ZFS_ERR_BAD_FS; + } + + if (zap->zap_salt == 0) { + printf("bad ZAP salt\n"); + return ZFS_ERR_BAD_FS; + } + + return ZFS_ERR_NONE; +} + +/* + * Fat ZAP lookup + * + */ +/* XXX */ +static int 
+fzap_lookup(dnode_end_t *zap_dnode, zap_phys_t *zap, + char *name, uint64_t *value, struct zfs_data *data) +{ + void *l; + uint64_t hash, idx, blkid; + int blksft = zfs_log2(zfs_to_cpu16(zap_dnode->dn.dn_datablkszsec, + zap_dnode->endian) << DNODE_SHIFT); + int err; + zfs_endian_t leafendian; + + err = zap_verify(zap); + if (err) + return err; + + hash = zap_hash(zap->zap_salt, name); + + /* get block id from index */ + if (zap->zap_ptrtbl.zt_numblks != 0) { + printf("external pointer tables not supported\n"); + return ZFS_ERR_NOT_IMPLEMENTED_YET; + } + idx = ZAP_HASH_IDX(hash, zap->zap_ptrtbl.zt_shift); + blkid = ((uint64_t *) zap)[idx + (1 << (blksft - 3 - 1))]; + + /* Get the leaf block */ + if ((1U << blksft) < sizeof(zap_leaf_phys_t)) { + printf("ZAP leaf is too small\n"); + return ZFS_ERR_BAD_FS; + } + err = dmu_read(zap_dnode, blkid, &l, &leafendian, data); + if (err) + return err; + + err = zap_leaf_lookup(l, leafendian, blksft, hash, name, value); + free(l); + return err; +} + +/* XXX */ +static int +fzap_iterate(dnode_end_t *zap_dnode, zap_phys_t *zap, + int (*hook)(const char *name, + uint64_t val, + struct zfs_data *data), + struct zfs_data *data) +{ + zap_leaf_phys_t *l; + void *l_in; + uint64_t idx, blkid; + uint16_t chunk; + int blksft = zfs_log2(zfs_to_cpu16(zap_dnode->dn.dn_datablkszsec, + zap_dnode->endian) << DNODE_SHIFT); + int err; + zfs_endian_t endian; + + if (zap_verify(zap)) + return 0; + + /* get block id from index */ + if (zap->zap_ptrtbl.zt_numblks != 0) { + printf("external pointer tables not supported\n"); + return 0; + } + /* Get the leaf block */ + if ((1U << blksft) < sizeof(zap_leaf_phys_t)) { + printf("ZAP leaf is too small\n"); + return 0; + } + for (idx = 0; idx < zap->zap_ptrtbl.zt_numblks; idx++) { + blkid = ((uint64_t *) zap)[idx + (1 << (blksft - 3 - 1))]; + + err = dmu_read(zap_dnode, blkid, &l_in, &endian, data); + l = l_in; + if (err) + continue; + + /* Verify if this is a valid leaf block */ + if 
(zfs_to_cpu64(l->l_hdr.lh_block_type, endian) != ZBT_LEAF) { + free(l); + continue; + } + if (zfs_to_cpu32(l->l_hdr.lh_magic, endian) != ZAP_LEAF_MAGIC) { + free(l); + continue; + } + + for (chunk = 0; chunk < ZAP_LEAF_NUMCHUNKS(blksft); chunk++) { + char *buf; + struct zap_leaf_array *la; + struct zap_leaf_entry *le; + uint64_t val; + le = ZAP_LEAF_ENTRY(l, blksft, chunk); + + /* Verify the chunk entry */ + if (le->le_type != ZAP_CHUNK_ENTRY) + continue; + + buf = malloc(zfs_to_cpu16(le->le_name_length, endian) + + 1); + if (zap_leaf_array_get(l, endian, blksft, le->le_name_chunk, + le->le_name_length, buf)) { + free(buf); + continue; + } + buf[le->le_name_length] = 0; + + if (le->le_int_size != 8 + || zfs_to_cpu16(le->le_value_length, endian) != 1) { + /* not a single uint64_t value; release the name and skip */ + free(buf); + continue; + } + + /* get the uint64_t property value */ + la = &ZAP_LEAF_CHUNK(l, blksft, le->le_value_chunk).l_array; + val = be64_to_cpu(la->la_array64); + if (hook(buf, val, data)) { + free(buf); + free(l); + return 1; + } + free(buf); + } + free(l); + } + return 0; +} + + +/* + * Read in the data of a zap object and find the value for a matching + * property name. + * + */ +static int +zap_lookup(dnode_end_t *zap_dnode, char *name, uint64_t *val, + struct zfs_data *data) +{ + uint64_t block_type; + int size; + void *zapbuf; + int err; + zfs_endian_t endian; + + /* Read in the first block of the zap object data.
*/ + size = zfs_to_cpu16(zap_dnode->dn.dn_datablkszsec, + zap_dnode->endian) << SPA_MINBLOCKSHIFT; + err = dmu_read(zap_dnode, 0, &zapbuf, &endian, data); + if (err) + return err; + block_type = zfs_to_cpu64(*((uint64_t *) zapbuf), endian); + + if (block_type == ZBT_MICRO) { + err = (mzap_lookup(zapbuf, endian, size, name, val)); + free(zapbuf); + return err; + } else if (block_type == ZBT_HEADER) { + /* this is a fat zap */ + err = (fzap_lookup(zap_dnode, zapbuf, name, val, data)); + free(zapbuf); + return err; + } + + printf("unknown ZAP type\n"); + return ZFS_ERR_BAD_FS; +} + +static int +zap_iterate(dnode_end_t *zap_dnode, + int (*hook)(const char *name, uint64_t val, + struct zfs_data *data), + struct zfs_data *data) +{ + uint64_t block_type; + int size; + void *zapbuf; + int err; + int ret; + zfs_endian_t endian; + + /* Read in the first block of the zap object data. */ + size = zfs_to_cpu16(zap_dnode->dn.dn_datablkszsec, zap_dnode->endian) << SPA_MINBLOCKSHIFT; + err = dmu_read(zap_dnode, 0, &zapbuf, &endian, data); + if (err) + return 0; + block_type = zfs_to_cpu64(*((uint64_t *) zapbuf), endian); + + if (block_type == ZBT_MICRO) { + ret = mzap_iterate(zapbuf, endian, size, hook, data); + free(zapbuf); + return ret; + } else if (block_type == ZBT_HEADER) { + /* this is a fat zap */ + ret = fzap_iterate(zap_dnode, zapbuf, hook, data); + free(zapbuf); + return ret; + } + printf("unknown ZAP type\n"); + return 0; +} + + +/* + * Get the dnode of an object number from the metadnode of an object set. 
+ * + * Input + * mdn - metadnode to get the object dnode + * objnum - object number for the object dnode + * buf - data buffer that holds the returning dnode + */ +static int +dnode_get(dnode_end_t *mdn, uint64_t objnum, uint8_t type, + dnode_end_t *buf, struct zfs_data *data) +{ + uint64_t blkid, blksz; /* the block id this object dnode is in */ + int epbs; /* shift of number of dnodes in a block */ + int idx; /* index within a block */ + void *dnbuf; + int err; + zfs_endian_t endian; + + blksz = zfs_to_cpu16(mdn->dn.dn_datablkszsec, + mdn->endian) << SPA_MINBLOCKSHIFT; + + epbs = zfs_log2(blksz) - DNODE_SHIFT; + blkid = objnum >> epbs; + idx = objnum & ((1 << epbs) - 1); + + if (data->dnode_buf != NULL && memcmp(data->dnode_mdn, mdn, + sizeof(*mdn)) == 0 + && objnum >= data->dnode_start && objnum < data->dnode_end) { + memmove(&(buf->dn), &(data->dnode_buf)[idx], DNODE_SIZE); + buf->endian = data->dnode_endian; + if (type && buf->dn.dn_type != type) { + printf("incorrect dnode type: %02X != %02x\n", buf->dn.dn_type, type); + return ZFS_ERR_BAD_FS; + } + return ZFS_ERR_NONE; + } + + err = dmu_read(mdn, blkid, &dnbuf, &endian, data); + if (err) + return err; + + free(data->dnode_buf); + free(data->dnode_mdn); + data->dnode_mdn = malloc(sizeof(*mdn)); + if (!data->dnode_mdn) { + data->dnode_buf = 0; + } else { + memcpy(data->dnode_mdn, mdn, sizeof(*mdn)); + data->dnode_buf = dnbuf; + data->dnode_start = blkid << epbs; + data->dnode_end = (blkid + 1) << epbs; + data->dnode_endian = endian; + } + + memmove(&(buf->dn), (dnode_phys_t *) dnbuf + idx, DNODE_SIZE); + buf->endian = endian; + if (type && buf->dn.dn_type != type) { + printf("incorrect dnode type\n"); + return ZFS_ERR_BAD_FS; + } + + return ZFS_ERR_NONE; +} + +/* + * Get the file dnode for a given file name where mdn is the meta dnode + * for this ZFS object set. When found, place the file dnode in dn. + * The 'path' argument will be mangled. 
+ * + */ +static int +dnode_get_path(dnode_end_t *mdn, const char *path_in, dnode_end_t *dn, + struct zfs_data *data) +{ + uint64_t objnum, version; + char *cname, ch; + int err = ZFS_ERR_NONE; + char *path, *path_buf; + struct dnode_chain { + struct dnode_chain *next; + dnode_end_t dn; + }; + struct dnode_chain *dnode_path = 0, *dn_new, *root; + + dn_new = malloc(sizeof(*dn_new)); + if (!dn_new) + return ZFS_ERR_OUT_OF_MEMORY; + dn_new->next = 0; + dnode_path = root = dn_new; + + err = dnode_get(mdn, MASTER_NODE_OBJ, DMU_OT_MASTER_NODE, + &(dnode_path->dn), data); + if (err) { + free(dn_new); + return err; + } + + err = zap_lookup(&(dnode_path->dn), ZPL_VERSION_STR, &version, data); + if (err) { + free(dn_new); + return err; + } + if (version > ZPL_VERSION) { + free(dn_new); + printf("too new ZPL version\n"); + return ZFS_ERR_NOT_IMPLEMENTED_YET; + } + + err = zap_lookup(&(dnode_path->dn), ZFS_ROOT_OBJ, &objnum, data); + if (err) { + free(dn_new); + return err; + } + + err = dnode_get(mdn, objnum, 0, &(dnode_path->dn), data); + if (err) { + free(dn_new); + return err; + } + + path = path_buf = strdup(path_in); + if (!path_buf) { + free(dn_new); + return ZFS_ERR_OUT_OF_MEMORY; + } + + while (1) { + /* skip leading slashes */ + while (*path == '/') + path++; + if (!*path) + break; + /* get the next component name */ + cname = path; + while (*path && *path != '/') + path++; + /* Skip dot. */ + if (cname + 1 == path && cname[0] == '.') + continue; + /* Handle double dot. */ + if (cname + 2 == path && cname[0] == '.' 
&& cname[1] == '.') { + if (dn_new->next) { + dn_new = dnode_path; + dnode_path = dn_new->next; + free(dn_new); + } else { + printf("can't resolve ..\n"); + err = ZFS_ERR_FILE_NOT_FOUND; + break; + } + continue; + } + + ch = *path; + *path = 0; /* ensure null termination */ + + if (dnode_path->dn.dn.dn_type != DMU_OT_DIRECTORY_CONTENTS) { + free(path_buf); + printf("not a directory\n"); + return ZFS_ERR_BAD_FILE_TYPE; + } + err = zap_lookup(&(dnode_path->dn), cname, &objnum, data); + if (err) + break; + + dn_new = malloc(sizeof(*dn_new)); + if (!dn_new) { + err = ZFS_ERR_OUT_OF_MEMORY; + break; + } + dn_new->next = dnode_path; + dnode_path = dn_new; + + objnum = ZFS_DIRENT_OBJ(objnum); + err = dnode_get(mdn, objnum, 0, &(dnode_path->dn), data); + if (err) + break; + + *path = ch; + } + + if (!err) + memcpy(dn, &(dnode_path->dn), sizeof(*dn)); + + while (dnode_path) { + dn_new = dnode_path->next; + free(dnode_path); + dnode_path = dn_new; + } + free(path_buf); + return err; +} + + +/* + * Given a MOS metadnode, get the metadnode of a given filesystem name (fsname), + * e.g. pool/rootfs, or a given object number (obj), e.g. the object number + * of pool/rootfs. + * + * If no fsname and no obj are given, return the DSL_DIR metadnode. + * If fsname is given, return its metadnode and its matching object number. + * If only obj is given, return the metadnode for this object number. 
+ * + */ +static int +get_filesystem_dnode(dnode_end_t *mosmdn, char *fsname, + dnode_end_t *mdn, struct zfs_data *data) +{ + uint64_t objnum; + int err; + + err = dnode_get(mosmdn, DMU_POOL_DIRECTORY_OBJECT, + DMU_OT_OBJECT_DIRECTORY, mdn, data); + if (err) + return err; + + err = zap_lookup(mdn, DMU_POOL_ROOT_DATASET, &objnum, data); + if (err) + return err; + + err = dnode_get(mosmdn, objnum, DMU_OT_DSL_DIR, mdn, data); + if (err) + return err; + + while (*fsname) { + uint64_t childobj; + char *cname, ch; + + while (*fsname == '/') + fsname++; + + if (!*fsname || *fsname == '@') + break; + + cname = fsname; + while (*fsname && !isspace(*fsname) && *fsname != '/') + fsname++; + ch = *fsname; + *fsname = 0; + + childobj = zfs_to_cpu64((((dsl_dir_phys_t *) DN_BONUS(&mdn->dn)))->dd_child_dir_zapobj, mdn->endian); + err = dnode_get(mosmdn, childobj, + DMU_OT_DSL_DIR_CHILD_MAP, mdn, data); + if (err) + return err; + + err = zap_lookup(mdn, cname, &objnum, data); + if (err) + return err; + + err = dnode_get(mosmdn, objnum, DMU_OT_DSL_DIR, mdn, data); + if (err) + return err; + + *fsname = ch; + } + return ZFS_ERR_NONE; +} + +static int +make_mdn(dnode_end_t *mdn, struct zfs_data *data) +{ + void *osp; + blkptr_t *bp; + size_t ospsize; + int err; + + bp = &(((dsl_dataset_phys_t *) DN_BONUS(&mdn->dn))->ds_bp); + err = zio_read(bp, mdn->endian, &osp, &ospsize, data); + if (err) + return err; + if (ospsize < OBJSET_PHYS_SIZE_V14) { + free(osp); + printf("too small osp\n"); + return ZFS_ERR_BAD_FS; + } + + mdn->endian = (zfs_to_cpu64(bp->blk_prop, mdn->endian)>>63) & 1; + memmove((char *) &(mdn->dn), + (char *) &((objset_phys_t *) osp)->os_meta_dnode, DNODE_SIZE); + free(osp); + return ZFS_ERR_NONE; +} + +static int +dnode_get_fullpath(const char *fullpath, dnode_end_t *mdn, + uint64_t *mdnobj, dnode_end_t *dn, int *isfs, + struct zfs_data *data) +{ + char *fsname, *snapname; + const char *ptr_at, *filename; + uint64_t headobj; + int err; + + ptr_at = strchr(fullpath, '@'); 
+ if (!ptr_at) { + *isfs = 1; + filename = 0; + snapname = 0; + fsname = strdup(fullpath); + } else { + const char *ptr_slash = strchr(ptr_at, '/'); + + *isfs = 0; + fsname = malloc(ptr_at - fullpath + 1); + if (!fsname) + return ZFS_ERR_OUT_OF_MEMORY; + memcpy(fsname, fullpath, ptr_at - fullpath); + fsname[ptr_at - fullpath] = 0; + if (ptr_at[1] && ptr_at[1] != '/') { + snapname = malloc(ptr_slash - ptr_at); + if (!snapname) { + free(fsname); + return ZFS_ERR_OUT_OF_MEMORY; + } + memcpy(snapname, ptr_at + 1, ptr_slash - ptr_at - 1); + snapname[ptr_slash - ptr_at - 1] = 0; + } else { + snapname = 0; + } + if (ptr_slash) + filename = ptr_slash; + else + filename = "/"; + printf("zfs fsname = '%s' snapname='%s' filename = '%s'\n", + fsname, snapname, filename); + } + + + err = get_filesystem_dnode(&(data->mos), fsname, dn, data); + + if (err) { + free(fsname); + free(snapname); + return err; + } + + headobj = zfs_to_cpu64(((dsl_dir_phys_t *) DN_BONUS(&dn->dn))->dd_head_dataset_obj, dn->endian); + + err = dnode_get(&(data->mos), headobj, DMU_OT_DSL_DATASET, mdn, data); + if (err) { + free(fsname); + free(snapname); + return err; + } + + if (snapname) { + uint64_t snapobj; + + snapobj = zfs_to_cpu64(((dsl_dataset_phys_t *) DN_BONUS(&mdn->dn))->ds_snapnames_zapobj, mdn->endian); + + err = dnode_get(&(data->mos), snapobj, + DMU_OT_DSL_DS_SNAP_MAP, mdn, data); + if (!err) + err = zap_lookup(mdn, snapname, &headobj, data); + if (!err) + err = dnode_get(&(data->mos), headobj, DMU_OT_DSL_DATASET, mdn, data); + if (err) { + free(fsname); + free(snapname); + return err; + } + } + + if (mdnobj) + *mdnobj = headobj; + + make_mdn(mdn, data); + + if (*isfs) { + free(fsname); + free(snapname); + return ZFS_ERR_NONE; + } + err = dnode_get_path(mdn, filename, dn, data); + free(fsname); + free(snapname); + return err; +} + +/* + * For a given XDR packed nvlist, verify the first 4 bytes and move on. 
+ * + * An XDR packed nvlist is encoded as (comments from nvs_xdr_create) : + * + * encoding method/host endian (4 bytes) + * nvl_version (4 bytes) + * nvl_nvflag (4 bytes) + * encoded nvpairs: + * encoded size of the nvpair (4 bytes) + * decoded size of the nvpair (4 bytes) + * name string size (4 bytes) + * name string data (sizeof(NV_ALIGN4(string)) + * data type (4 bytes) + * # of elements in the nvpair (4 bytes) + * data + * 2 zero's for the last nvpair + * (end of the entire list) (8 bytes) + * + */ + +static int +nvlist_find_value(char *nvlist, char *name, int valtype, char **val, + size_t *size_out, size_t *nelm_out) +{ + int name_len, type, encode_size; + char *nvpair, *nvp_name; + + /* Verify if the 1st and 2nd byte in the nvlist are valid. */ + /* NOTE: independently of what endianness header announces all + subsequent values are big-endian. */ + if (nvlist[0] != NV_ENCODE_XDR || (nvlist[1] != NV_LITTLE_ENDIAN + && nvlist[1] != NV_BIG_ENDIAN)) { + printf("zfs incorrect nvlist header\n"); + return ZFS_ERR_BAD_FS; + } + + /* skip the header, nvl_version, and nvl_nvflag */ + nvlist = nvlist + 4 * 3; + /* + * Loop thru the nvpair list + * The XDR representation of an integer is in big-endian byte order. 
+ */ + while ((encode_size = be32_to_cpu(*(uint32_t *) nvlist))) { + int nelm; + + nvpair = nvlist + 4 * 2; /* skip the encode/decode size */ + + name_len = be32_to_cpu(*(uint32_t *) nvpair); + nvpair += 4; + + nvp_name = nvpair; + nvpair = nvpair + ((name_len + 3) & ~3); /* align */ + + type = be32_to_cpu(*(uint32_t *) nvpair); + nvpair += 4; + + nelm = be32_to_cpu(*(uint32_t *) nvpair); + if (nelm < 1) { + printf("empty nvpair\n"); + return ZFS_ERR_BAD_FS; + } + + nvpair += 4; + + if ((strncmp(nvp_name, name, name_len) == 0) && type == valtype) { + *val = nvpair; + *size_out = encode_size; + if (nelm_out) + *nelm_out = nelm; + return 1; + } + + nvlist += encode_size; /* goto the next nvpair */ + } + return 0; +} + +int +zfs_nvlist_lookup_uint64(char *nvlist, char *name, uint64_t *out) +{ + char *nvpair; + size_t size; + int found; + + found = nvlist_find_value(nvlist, name, DATA_TYPE_UINT64, &nvpair, &size, 0); + if (!found) + return 0; + if (size < sizeof(uint64_t)) { + printf("invalid uint64\n"); + return ZFS_ERR_BAD_FS; + } + + *out = be64_to_cpu(*(uint64_t *) nvpair); + return 1; +} + +char * +zfs_nvlist_lookup_string(char *nvlist, char *name) +{ + char *nvpair; + char *ret; + size_t slen; + size_t size; + int found; + + found = nvlist_find_value(nvlist, name, DATA_TYPE_STRING, &nvpair, &size, 0); + if (!found) + return 0; + if (size < 4) { + printf("invalid string\n"); + return 0; + } + slen = be32_to_cpu(*(uint32_t *) nvpair); + if (slen > size - 4) + slen = size - 4; + ret = malloc(slen + 1); + if (!ret) + return 0; + memcpy(ret, nvpair + 4, slen); + ret[slen] = 0; + return ret; +} + +char * +zfs_nvlist_lookup_nvlist(char *nvlist, char *name) +{ + char *nvpair; + char *ret; + size_t size; + int found; + + found = nvlist_find_value(nvlist, name, DATA_TYPE_NVLIST, &nvpair, + &size, 0); + if (!found) + return 0; + ret = calloc(1, size + 3 * sizeof(uint32_t)); + if (!ret) + return 0; + memcpy(ret, nvlist, sizeof(uint32_t)); + + memcpy(ret + sizeof(uint32_t), 
nvpair, size); + return ret; +} + +int +zfs_nvlist_lookup_nvlist_array_get_nelm(char *nvlist, char *name) +{ + char *nvpair; + size_t nelm, size; + int found; + + found = nvlist_find_value(nvlist, name, DATA_TYPE_NVLIST, &nvpair, + &size, &nelm); + if (!found) + return -1; + return nelm; +} + +char * +zfs_nvlist_lookup_nvlist_array(char *nvlist, char *name, + size_t index) +{ + char *nvpair, *nvpairptr; + int found; + char *ret; + size_t size; + unsigned i; + size_t nelm; + + found = nvlist_find_value(nvlist, name, DATA_TYPE_NVLIST, &nvpair, + &size, &nelm); + if (!found) + return 0; + if (index >= nelm) { + printf("trying to lookup past nvlist array\n"); + return 0; + } + + nvpairptr = nvpair; + + for (i = 0; i < index; i++) { + uint32_t encode_size; + + /* skip nvl_version and nvl_nvflag of the embedded nvlist */ + nvpairptr = nvpairptr + 4 * 2; + + while (nvpairptr < nvpair + size + && (encode_size = be32_to_cpu(*(uint32_t *) nvpairptr))) + nvpairptr += encode_size; /* goto the next nvpair */ + + nvpairptr = nvpairptr + 4 * 2; /* skip the ending 2 zeros - 8 bytes */ + } + + if (nvpairptr >= nvpair + size + || nvpairptr + be32_to_cpu(*(uint32_t *) (nvpairptr + 4 * 2)) + >= nvpair + size) { + printf("incorrect nvlist array\n"); + return 0; + } + + ret = calloc(1, be32_to_cpu(*(uint32_t *) (nvpairptr + 4 * 2)) + + 3 * sizeof(uint32_t)); + if (!ret) + return 0; + memcpy(ret, nvlist, sizeof(uint32_t)); + + memcpy(ret + sizeof(uint32_t), nvpairptr, size); + return ret; +} + +static int +int_zfs_fetch_nvlist(struct zfs_data *data, char **nvlist) +{ + int err; + + *nvlist = malloc(VDEV_PHYS_SIZE); + if (!*nvlist) + return ZFS_ERR_OUT_OF_MEMORY; + /* Read in the vdev name-value pair list (112K). */ + err = zfs_devread(data->vdev_phys_sector, 0, VDEV_PHYS_SIZE, *nvlist); + if (err) { + free(*nvlist); + *nvlist = 0; + return err; + } + return ZFS_ERR_NONE; +} + +/* + * Check the disk label information and retrieve needed vdev name-value pairs. 
+ * + */ +static int +check_pool_label(struct zfs_data *data) +{ + uint64_t pool_state; + char *nvlist; /* for the pool */ + char *vdevnvlist; /* for the vdev */ + uint64_t diskguid; + uint64_t version; + int found; + int err; + + err = int_zfs_fetch_nvlist(data, &nvlist); + if (err) + return err; + + found = zfs_nvlist_lookup_uint64(nvlist, ZPOOL_CONFIG_POOL_STATE, + &pool_state); + if (!found) { + free(nvlist); + printf("zfs pool state not found\n"); + return ZFS_ERR_BAD_FS; + } + + if (pool_state == POOL_STATE_DESTROYED) { + free(nvlist); + printf("zpool is marked as destroyed\n"); + return ZFS_ERR_BAD_FS; + } + + data->label_txg = 0; + found = zfs_nvlist_lookup_uint64(nvlist, ZPOOL_CONFIG_POOL_TXG, + &data->label_txg); + if (!found) { + free(nvlist); + printf("zfs pool txg not found\n"); + return ZFS_ERR_BAD_FS; + } + + /* not an active device */ + if (data->label_txg == 0) { + free(nvlist); + printf("zpool is not active\n"); + return ZFS_ERR_BAD_FS; + } + + found = zfs_nvlist_lookup_uint64(nvlist, ZPOOL_CONFIG_VERSION, + &version); + if (!found) { + free(nvlist); + printf("zpool config version not found\n"); + return ZFS_ERR_BAD_FS; + } + + if (version > SPA_VERSION) { + free(nvlist); + printf("SPA version too new %llu > %llu\n", + (unsigned long long) version, + (unsigned long long) SPA_VERSION); + return ZFS_ERR_NOT_IMPLEMENTED_YET; + } + + vdevnvlist = zfs_nvlist_lookup_nvlist(nvlist, ZPOOL_CONFIG_VDEV_TREE); + if (!vdevnvlist) { + free(nvlist); + printf("ZFS config vdev tree not found\n"); + return ZFS_ERR_BAD_FS; + } + + found = zfs_nvlist_lookup_uint64(vdevnvlist, ZPOOL_CONFIG_ASHIFT, + &data->vdev_ashift); + free(vdevnvlist); + if (!found) { + free(nvlist); + printf("ZPOOL config ashift not found\n"); + return ZFS_ERR_BAD_FS; + } + + found = zfs_nvlist_lookup_uint64(nvlist, ZPOOL_CONFIG_GUID, &diskguid); + if (!found) { + free(nvlist); + printf("ZPOOL config guid not found\n"); + return ZFS_ERR_BAD_FS; + } + + found = zfs_nvlist_lookup_uint64(nvlist, 
ZPOOL_CONFIG_POOL_GUID, &data->pool_guid); + if (!found) { + free(nvlist); + printf("ZPOOL config pool guid not found\n"); + return ZFS_ERR_BAD_FS; + } + + free(nvlist); + + printf("ZFS Pool GUID: %llu (%016llx) Label: GUID: %llu (%016llx), txg: %llu, SPA v%llu, ashift: %llu\n", + (unsigned long long) data->pool_guid, + (unsigned long long) data->pool_guid, + (unsigned long long) diskguid, + (unsigned long long) diskguid, + (unsigned long long) data->label_txg, + (unsigned long long) version, + (unsigned long long) data->vdev_ashift); + + return ZFS_ERR_NONE; +} + +/* + * vdev_label_start returns the physical disk offset (in bytes) of + * label "l". + */ +static uint64_t vdev_label_start(uint64_t psize, int l) +{ + return (l * sizeof(vdev_label_t) + (l < VDEV_LABELS / 2 ? + 0 : psize - + VDEV_LABELS * sizeof(vdev_label_t))); +} + +void +zfs_unmount(struct zfs_data *data) +{ + free(data->dnode_buf); + free(data->dnode_mdn); + free(data->file_buf); + free(data); +} + +/* + * zfs_mount() locates a valid uberblock of the root pool and reads in its MOS + * to the memory address MOS. + * + */ +struct zfs_data * +zfs_mount(device_t dev) +{ + struct zfs_data *data = 0; + int label = 0, bestlabel = -1; + char *ub_array; + uberblock_t *ubbest; + uberblock_t *ubcur = NULL; + void *osp = 0; + size_t ospsize; + int err; + + data = malloc(sizeof(*data)); + if (!data) + return 0; + memset(data, 0, sizeof(*data)); + + ub_array = malloc(VDEV_UBERBLOCK_RING); + if (!ub_array) { + zfs_unmount(data); + return 0; + } + + ubbest = malloc(sizeof(*ubbest)); + if (!ubbest) { + free(ub_array); + zfs_unmount(data); + return 0; + } + memset(ubbest, 0, sizeof(*ubbest)); + + /* + * Some eltorito stacks don't give us a size, so + * we end up setting the size to MAXUINT. Further, + * some of these devices stop working once a single + * read past the end has been issued. 
Checking + * for a maximum part_length and skipping the backup + * labels at the end of the slice/partition/device + * avoids breaking down on such devices. + */ + const int vdevnum = + dev->part_length == 0 ? + VDEV_LABELS / 2 : VDEV_LABELS; + + /* Size in bytes of the device (disk or partition) aligned to label size*/ + uint64_t device_size = + dev->part_length << SECTOR_BITS; + + const uint64_t alignedbytes = + P2ALIGN(device_size, (uint64_t) sizeof(vdev_label_t)); + + for (label = 0; label < vdevnum; label++) { + uint64_t labelstartbytes = vdev_label_start(alignedbytes, label); + uint64_t labelstart = labelstartbytes >> SECTOR_BITS; + + debug("zfs reading label %d at sector %llu (byte %llu)\n", + label, (unsigned long long) labelstart, + (unsigned long long) labelstartbytes); + + data->vdev_phys_sector = labelstart + + ((VDEV_SKIP_SIZE + VDEV_BOOT_HEADER_SIZE) >> SECTOR_BITS); + + err = check_pool_label(data); + if (err) { + printf("zfs error checking label %d\n", label); + continue; + } + + /* Read in the uberblock ring (128K). */ + err = zfs_devread(data->vdev_phys_sector + + (VDEV_PHYS_SIZE >> SECTOR_BITS), + 0, VDEV_UBERBLOCK_RING, ub_array); + if (err) { + printf("zfs error reading uberblock ring for label %d\n", label); + continue; + } + + ubcur = find_bestub(ub_array, data); + if (!ubcur) { + printf("zfs No good uberblocks found in label %d\n", label); + continue; + } + + if (vdev_uberblock_compare(ubcur, ubbest) > 0) { + /* Looks like the block is good, so use it.*/ + memcpy(ubbest, ubcur, sizeof(*ubbest)); + bestlabel = label; + debug("zfs Current best uberblock found in label %d\n", label); + } + } + free(ub_array); + + /* We zero'd the structure to begin with. If we never assigned to it, + magic will still be zero. 
*/ + if (!ubbest->ub_magic) { + printf("couldn't find a valid ZFS label\n"); + zfs_unmount(data); + free(ubbest); + return 0; + } + + debug("zfs ubbest %p in label %d\n", ubbest, bestlabel); + + zfs_endian_t ub_endian = + zfs_to_cpu64(ubbest->ub_magic, LITTLE_ENDIAN) == UBERBLOCK_MAGIC + ? LITTLE_ENDIAN : BIG_ENDIAN; + + debug("zfs endian set to %s\n", !ub_endian ? "big" : "little"); + + err = zio_read(&ubbest->ub_rootbp, ub_endian, &osp, &ospsize, data); + + if (err) { + printf("couldn't zio_read object directory\n"); + zfs_unmount(data); + free(ubbest); + return 0; + } + + if (ospsize < OBJSET_PHYS_SIZE_V14) { + printf("osp too small\n"); + zfs_unmount(data); + free(osp); + free(ubbest); + return 0; + } + + /* Got the MOS. Save it at the memory addr MOS. */ + memmove(&(data->mos.dn), &((objset_phys_t *) osp)->os_meta_dnode, DNODE_SIZE); + data->mos.endian = + (zfs_to_cpu64(ubbest->ub_rootbp.blk_prop, ub_endian) >> 63) & 1; + memmove(&(data->current_uberblock), ubbest, sizeof(uberblock_t)); + + free(osp); + free(ubbest); + + return data; +} + +int +zfs_fetch_nvlist(device_t dev, char **nvlist) +{ + struct zfs_data *zfs; + int err; + + zfs = zfs_mount(dev); + if (!zfs) + return ZFS_ERR_BAD_FS; + err = int_zfs_fetch_nvlist(zfs, nvlist); + zfs_unmount(zfs); + return err; +} + +static int +zfs_label(device_t device, char **label) +{ + char *nvlist; + int err; + struct zfs_data *data; + + data = zfs_mount(device); + if (!data) + return ZFS_ERR_BAD_FS; + + err = int_zfs_fetch_nvlist(data, &nvlist); + if (err) { + zfs_unmount(data); + return err; + } + + *label = zfs_nvlist_lookup_string(nvlist, ZPOOL_CONFIG_POOL_NAME); + free(nvlist); + zfs_unmount(data); + return ZFS_ERR_NONE; +} + +static int +zfs_uuid(device_t device, char **uuid) +{ + struct zfs_data *data; + + data = zfs_mount(device); + if (!data) + return ZFS_ERR_BAD_FS; + + *uuid = malloc(17); /* %016llx + nil */ + if (!*uuid) { + zfs_unmount(data); + return ZFS_ERR_OUT_OF_MEMORY; + } + + /* *uuid = xasprintf ("%016llx", (long long 
unsigned) data->pool_guid);*/ + snprintf(*uuid, 17, "%016llx", (long long unsigned) data->pool_guid); + zfs_unmount(data); + + return ZFS_ERR_NONE; +} + +/* + * zfs_open() locates a file in the root pool by following the + * MOS and places the dnode of the file in the memory address DNODE. + */ +int +zfs_open(struct zfs_file *file, const char *fsfilename) +{ + struct zfs_data *data; + int err; + int isfs; + + data = zfs_mount(file->device); + if (!data) + return ZFS_ERR_BAD_FS; + + err = dnode_get_fullpath(fsfilename, &(data->mdn), 0, + &(data->dnode), &isfs, data); + if (err) { + zfs_unmount(data); + return err; + } + + if (isfs) { + zfs_unmount(data); + printf("Missing @ or / separator\n"); + return ZFS_ERR_FILE_NOT_FOUND; + } + + /* We found the dnode for this file. Verify that it is a plain file. */ + if (data->dnode.dn.dn_type != DMU_OT_PLAIN_FILE_CONTENTS) { + zfs_unmount(data); + printf("not a file\n"); + return ZFS_ERR_BAD_FILE_TYPE; + } + + /* get the file size and set the file position to 0 */ + + /* + * For DMU_OT_SA we will need to locate the SIZE + * attribute, which could be either in the bonus buffer + * or the "spill" block. 
+ */ + if (data->dnode.dn.dn_bonustype == DMU_OT_SA) { + void *sahdrp; + int hdrsize; + + if (data->dnode.dn.dn_bonuslen != 0) { + sahdrp = (sa_hdr_phys_t *) DN_BONUS(&data->dnode.dn); + } else if (data->dnode.dn.dn_flags & DNODE_FLAG_SPILL_BLKPTR) { + blkptr_t *bp = &data->dnode.dn.dn_spill; + + err = zio_read(bp, data->dnode.endian, &sahdrp, NULL, data); + if (err) { + zfs_unmount(data); + return err; + } + } else { + printf("filesystem is corrupt :(\n"); + zfs_unmount(data); + return ZFS_ERR_BAD_FS; + } + + hdrsize = SA_HDR_SIZE(((sa_hdr_phys_t *) sahdrp)); + file->size = *(uint64_t *) ((char *) sahdrp + hdrsize + SA_SIZE_OFFSET); + } else { + file->size = zfs_to_cpu64(((znode_phys_t *) DN_BONUS(&data->dnode.dn))->zp_size, data->dnode.endian); + } + + file->data = data; + file->offset = 0; + + return ZFS_ERR_NONE; +} + +uint64_t +zfs_read(zfs_file_t file, char *buf, uint64_t len) +{ + struct zfs_data *data = (struct zfs_data *) file->data; + int blksz, movesize; + uint64_t length; + int64_t red; + int err; + + if (data->file_buf == NULL) { + data->file_buf = malloc(SPA_MAXBLOCKSIZE); + if (!data->file_buf) + return -1; + data->file_start = data->file_end = 0; + } + + /* + * If offset is in memory, move it into the buffer provided and return. + */ + if (file->offset >= data->file_start + && file->offset + len <= data->file_end) { + memmove(buf, data->file_buf + file->offset - data->file_start, + len); + return len; + } + + blksz = zfs_to_cpu16(data->dnode.dn.dn_datablkszsec, + data->dnode.endian) << SPA_MINBLOCKSHIFT; + + /* + * Entire Dnode is too big to fit into the space available. We + * will need to read it in chunks. This could be optimized to + * read in as large a chunk as there is space available, but for + * now, this only reads in one data block at a time. + */ + length = len; + red = 0; + while (length) { + void *t; + /* + * Find requested blkid and the offset within that block. 
+ */ + uint64_t blkid = (file->offset + red) / blksz; + free(data->file_buf); + data->file_buf = 0; + + err = dmu_read(&(data->dnode), blkid, &t, + 0, data); + data->file_buf = t; + if (err) + return -1; + + data->file_start = blkid * blksz; + data->file_end = data->file_start + blksz; + + movesize = MIN(length, data->file_end - (int) file->offset - red); + + memmove(buf, data->file_buf + file->offset + red + - data->file_start, movesize); + buf += movesize; + length -= movesize; + red += movesize; + } + + return len; +} + +int +zfs_close(zfs_file_t file) +{ + zfs_unmount((struct zfs_data *) file->data); + return ZFS_ERR_NONE; +} + +int +zfs_getmdnobj(device_t dev, const char *fsfilename, + uint64_t *mdnobj) +{ + struct zfs_data *data; + int err; + int isfs; + + data = zfs_mount(dev); + if (!data) + return ZFS_ERR_BAD_FS; + + err = dnode_get_fullpath(fsfilename, &(data->mdn), mdnobj, + &(data->dnode), &isfs, data); + zfs_unmount(data); + return err; +} + +static void +fill_fs_info(struct zfs_dirhook_info *info, + dnode_end_t mdn, struct zfs_data *data) +{ + int err; + dnode_end_t dn; + uint64_t objnum; + uint64_t headobj; + + memset(info, 0, sizeof(*info)); + + info->dir = 1; + + if (mdn.dn.dn_type == DMU_OT_DSL_DIR) { + headobj = zfs_to_cpu64(((dsl_dir_phys_t *) DN_BONUS(&mdn.dn))->dd_head_dataset_obj, mdn.endian); + + err = dnode_get(&(data->mos), headobj, DMU_OT_DSL_DATASET, &mdn, data); + if (err) { + printf("zfs failed here 1\n"); + return; + } + } + make_mdn(&mdn, data); + err = dnode_get(&mdn, MASTER_NODE_OBJ, DMU_OT_MASTER_NODE, + &dn, data); + if (err) { + printf("zfs failed here 2\n"); + return; + } + + err = zap_lookup(&dn, ZFS_ROOT_OBJ, &objnum, data); + if (err) { + printf("zfs failed here 3\n"); + return; + } + + err = dnode_get(&mdn, objnum, 0, &dn, data); + if (err) { + printf("zfs failed here 4\n"); + return; + } + + info->mtimeset = 1; + info->mtime = zfs_to_cpu64(((znode_phys_t *) DN_BONUS(&dn.dn))->zp_mtime[0], dn.endian); + + return; +} + 
+static int iterate_zap(const char *name, uint64_t val, struct zfs_data *data) +{ + struct zfs_dirhook_info info; + dnode_end_t dn; + + memset(&info, 0, sizeof(info)); + + /* skip entries whose dnode cannot be read */ + if (dnode_get(&(data->mdn), val, 0, &dn, data)) + return 0; + info.mtimeset = 1; + info.mtime = zfs_to_cpu64(((znode_phys_t *) DN_BONUS(&dn.dn))->zp_mtime[0], dn.endian); + info.dir = (dn.dn.dn_type == DMU_OT_DIRECTORY_CONTENTS); + debug("zfs type=%d, name=%s\n", + (int)dn.dn.dn_type, (char *)name); + if (!data->userhook) + return 0; + return data->userhook(name, &info); +} + +static int iterate_zap_fs(const char *name, uint64_t val, struct zfs_data *data) +{ + struct zfs_dirhook_info info; + dnode_end_t mdn; + int err; + err = dnode_get(&(data->mos), val, 0, &mdn, data); + if (err) + return 0; + if (mdn.dn.dn_type != DMU_OT_DSL_DIR) + return 0; + + fill_fs_info(&info, mdn, data); + + if (!data->userhook) + return 0; + return data->userhook(name, &info); +} + +static int iterate_zap_snap(const char *name, uint64_t val, struct zfs_data *data) +{ + struct zfs_dirhook_info info; + char *name2; + int ret = 0; + dnode_end_t mdn; + int err; + + err = dnode_get(&(data->mos), val, 0, &mdn, data); + if (err) + return 0; + + if (mdn.dn.dn_type != DMU_OT_DSL_DATASET) + return 0; + + fill_fs_info(&info, mdn, data); + + name2 = malloc(strlen(name) + 2); + if (!name2) + return 0; + name2[0] = '@'; + memcpy(name2 + 1, name, strlen(name) + 1); + if (data->userhook) + ret = data->userhook(name2, &info); + free(name2); + return ret; +} + +int +zfs_ls(device_t device, const char *path, + int (*hook)(const char *, const struct zfs_dirhook_info *)) +{ + struct zfs_data *data; + int err; + int isfs; +#if 0 + char *label = NULL; + + zfs_label(device, &label); + if (label) + printf("ZPOOL label '%s'\n", + label); +#endif + + data = zfs_mount(device); + if (!data) + return ZFS_ERR_BAD_FS; + + data->userhook = hook; + + err = dnode_get_fullpath(path, &(data->mdn), 0, &(data->dnode), &isfs, data); + if (err) { + zfs_unmount(data); + return err; + } + if (isfs) { + 
uint64_t childobj, headobj; + uint64_t snapobj; + dnode_end_t dn; + struct zfs_dirhook_info info; + + fill_fs_info(&info, data->dnode, data); + hook("@", &info); + + childobj = zfs_to_cpu64(((dsl_dir_phys_t *) DN_BONUS(&data->dnode.dn))->dd_child_dir_zapobj, data->dnode.endian); + headobj = zfs_to_cpu64(((dsl_dir_phys_t *) DN_BONUS(&data->dnode.dn))->dd_head_dataset_obj, data->dnode.endian); + err = dnode_get(&(data->mos), childobj, + DMU_OT_DSL_DIR_CHILD_MAP, &dn, data); + if (err) { + zfs_unmount(data); + return err; + } + + + zap_iterate(&dn, iterate_zap_fs, data); + + err = dnode_get(&(data->mos), headobj, DMU_OT_DSL_DATASET, &dn, data); + if (err) { + zfs_unmount(data); + return err; + } + + snapobj = zfs_to_cpu64(((dsl_dataset_phys_t *) DN_BONUS(&dn.dn))->ds_snapnames_zapobj, dn.endian); + + err = dnode_get(&(data->mos), snapobj, + DMU_OT_DSL_DS_SNAP_MAP, &dn, data); + if (err) { + zfs_unmount(data); + return err; + } + + zap_iterate(&dn, iterate_zap_snap, data); + } else { + if (data->dnode.dn.dn_type != DMU_OT_DIRECTORY_CONTENTS) { + zfs_unmount(data); + printf("not a directory\n"); + return ZFS_ERR_BAD_FILE_TYPE; + } + zap_iterate(&(data->dnode), iterate_zap, data); + } + zfs_unmount(data); + return ZFS_ERR_NONE; +} + diff --git a/fs/zfs/zfs_fletcher.c b/fs/zfs/zfs_fletcher.c new file mode 100644 index 0000000..28a48de --- /dev/null +++ b/fs/zfs/zfs_fletcher.c @@ -0,0 +1,88 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. + */ +/* + * Copyright 2007 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. + */ + +#include <common.h> +#include <malloc.h> +#include <linux/stat.h> +#include <linux/time.h> +#include <linux/ctype.h> +#include <asm/byteorder.h> +#include "zfs_common.h" + +#include <zfs/zfs.h> +#include <zfs/zio.h> +#include <zfs/dnode.h> +#include <zfs/uberblock_impl.h> +#include <zfs/vdev_impl.h> +#include <zfs/zio_checksum.h> +#include <zfs/zap_impl.h> +#include <zfs/zap_leaf.h> +#include <zfs/zfs_znode.h> +#include <zfs/dmu.h> +#include <zfs/dmu_objset.h> +#include <zfs/dsl_dir.h> +#include <zfs/dsl_dataset.h> + +void +fletcher_2_endian(const void *buf, uint64_t size, + zfs_endian_t endian, + zio_cksum_t *zcp) +{ + const uint64_t *ip = buf; + const uint64_t *ipend = ip + (size / sizeof(uint64_t)); + uint64_t a0, b0, a1, b1; + + for (a0 = b0 = a1 = b1 = 0; ip < ipend; ip += 2) { + a0 += zfs_to_cpu64(ip[0], endian); + a1 += zfs_to_cpu64(ip[1], endian); + b0 += a0; + b1 += a1; + } + + zcp->zc_word[0] = cpu_to_zfs64(a0, endian); + zcp->zc_word[1] = cpu_to_zfs64(a1, endian); + zcp->zc_word[2] = cpu_to_zfs64(b0, endian); + zcp->zc_word[3] = cpu_to_zfs64(b1, endian); +} + +void +fletcher_4_endian(const void *buf, uint64_t size, zfs_endian_t endian, + zio_cksum_t *zcp) +{ + const uint32_t *ip = buf; + const uint32_t *ipend = ip + (size / sizeof(uint32_t)); + uint64_t a, b, c, d; + + for (a = b = c = d = 0; ip < ipend; ip++) { + a += zfs_to_cpu32(ip[0], endian); + b += a; + c += b; + d += c; + } + + zcp->zc_word[0] = cpu_to_zfs64(a, endian); + zcp->zc_word[1] = cpu_to_zfs64(b, endian); + zcp->zc_word[2] = cpu_to_zfs64(c, endian); + zcp->zc_word[3] = cpu_to_zfs64(d, endian); +} + diff --git 
a/fs/zfs/zfs_lzjb.c b/fs/zfs/zfs_lzjb.c new file mode 100644 index 0000000..b22d7e1 --- /dev/null +++ b/fs/zfs/zfs_lzjb.c @@ -0,0 +1,97 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. + */ +/* + * Copyright 2007 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. 
+ */ + +#include <common.h> +#include <malloc.h> +#include <linux/stat.h> +#include <linux/time.h> +#include <linux/ctype.h> +#include <asm/byteorder.h> +#include "zfs_common.h" + +#include <zfs/zfs.h> +#include <zfs/zio.h> +#include <zfs/dnode.h> +#include <zfs/uberblock_impl.h> +#include <zfs/vdev_impl.h> +#include <zfs/zio_checksum.h> +#include <zfs/zap_impl.h> +#include <zfs/zap_leaf.h> +#include <zfs/zfs_znode.h> +#include <zfs/dmu.h> +#include <zfs/dmu_objset.h> +#include <zfs/dsl_dir.h> +#include <zfs/dsl_dataset.h> + +#define MATCH_BITS 6 +#define MATCH_MIN 3 +#define OFFSET_MASK ((1 << (16 - MATCH_BITS)) - 1) + +/* + * Decompression Entry - lzjb + */ +#ifndef NBBY +#define NBBY 8 +#endif + +int +lzjb_decompress(void *s_start, void *d_start, uint32_t s_len, + uint32_t d_len) +{ + uint8_t *src = s_start; + uint8_t *dst = d_start; + uint8_t *d_end = (uint8_t *) d_start + d_len; + uint8_t *s_end = (uint8_t *) s_start + s_len; + uint8_t *cpy, copymap = 0; + int copymask = 1 << (NBBY - 1); + + while (dst < d_end && src < s_end) { + if ((copymask <<= 1) == (1 << NBBY)) { + copymask = 1; + copymap = *src++; + } + if (src >= s_end) { + printf("lzjb decompression failed\n"); + return ZFS_ERR_BAD_FS; + } + if (copymap & copymask) { + int mlen = (src[0] >> (NBBY - MATCH_BITS)) + MATCH_MIN; + int offset = ((src[0] << NBBY) | src[1]) & OFFSET_MASK; + src += 2; + cpy = dst - offset; + if (src > s_end || cpy < (uint8_t *) d_start) { + printf("lzjb decompression failed\n"); + return ZFS_ERR_BAD_FS; + } + while (--mlen >= 0 && dst < d_end) + *dst++ = *cpy++; + } else { + *dst++ = *src++; + } + } + if (dst < d_end) { + printf("lzjb decompression failed\n"); + return ZFS_ERR_BAD_FS; + } + return ZFS_ERR_NONE; +} diff --git a/fs/zfs/zfs_sha256.c b/fs/zfs/zfs_sha256.c new file mode 100644 index 0000000..f1a4d97 --- /dev/null +++ b/fs/zfs/zfs_sha256.c @@ -0,0 +1,148 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software 
Foundation, Inc. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. + */ +/* + * Copyright 2007 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. + */ + +#include <common.h> +#include <malloc.h> +#include <linux/stat.h> +#include <linux/time.h> +#include <linux/ctype.h> +#include <asm/byteorder.h> +#include "zfs_common.h" + +#include <zfs/zfs.h> +#include <zfs/zio.h> +#include <zfs/dnode.h> +#include <zfs/uberblock_impl.h> +#include <zfs/vdev_impl.h> +#include <zfs/zio_checksum.h> +#include <zfs/zap_impl.h> +#include <zfs/zap_leaf.h> +#include <zfs/zfs_znode.h> +#include <zfs/dmu.h> +#include <zfs/dmu_objset.h> +#include <zfs/dsl_dir.h> +#include <zfs/dsl_dataset.h> + +/* + * SHA-256 checksum, as specified in FIPS 180-2, available at: + * http://csrc.nist.gov/cryptval + * + * This is a very compact implementation of SHA-256. + * It is designed to be simple and portable, not to be fast. + */ + +/* + * The literal definitions according to FIPS180-2 would be: + * + * Ch(x, y, z) (((x) & (y)) ^ ((~(x)) & (z))) + * Maj(x, y, z) (((x) & (y)) | ((x) & (z)) | ((y) & (z))) + * + * We use logical equivalents which require one less op. 
+ */ +#define Ch(x, y, z) ((z) ^ ((x) & ((y) ^ (z)))) +#define Maj(x, y, z) (((x) & (y)) ^ ((z) & ((x) ^ (y)))) +#define Rot32(x, s) (((x) >> s) | ((x) << (32 - s))) +#define SIGMA0(x) (Rot32(x, 2) ^ Rot32(x, 13) ^ Rot32(x, 22)) +#define SIGMA1(x) (Rot32(x, 6) ^ Rot32(x, 11) ^ Rot32(x, 25)) +#define sigma0(x) (Rot32(x, 7) ^ Rot32(x, 18) ^ ((x) >> 3)) +#define sigma1(x) (Rot32(x, 17) ^ Rot32(x, 19) ^ ((x) >> 10)) + +static const uint32_t SHA256_K[64] = { + 0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5, + 0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5, + 0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3, + 0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174, + 0xe49b69c1, 0xefbe4786, 0x0fc19dc6, 0x240ca1cc, + 0x2de92c6f, 0x4a7484aa, 0x5cb0a9dc, 0x76f988da, + 0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7, + 0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967, + 0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, 0x53380d13, + 0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85, + 0xa2bfe8a1, 0xa81a664b, 0xc24b8b70, 0xc76c51a3, + 0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070, + 0x19a4c116, 0x1e376c08, 0x2748774c, 0x34b0bcb5, + 0x391c0cb3, 0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3, + 0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208, + 0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2 +}; + +static void +SHA256Transform(uint32_t *H, const uint8_t *cp) +{ + uint32_t a, b, c, d, e, f, g, h, t, T1, T2, W[64]; + + for (t = 0; t < 16; t++, cp += 4) + W[t] = (cp[0] << 24) | (cp[1] << 16) | (cp[2] << 8) | cp[3]; + + for (t = 16; t < 64; t++) + W[t] = sigma1(W[t - 2]) + W[t - 7] + + sigma0(W[t - 15]) + W[t - 16]; + + a = H[0]; b = H[1]; c = H[2]; d = H[3]; + e = H[4]; f = H[5]; g = H[6]; h = H[7]; + + for (t = 0; t < 64; t++) { + T1 = h + SIGMA1(e) + Ch(e, f, g) + SHA256_K[t] + W[t]; + T2 = SIGMA0(a) + Maj(a, b, c); + h = g; g = f; f = e; e = d + T1; + d = c; c = b; b = a; a = T1 + T2; + } + + H[0] += a; H[1] += b; H[2] += c; H[3] += d; + H[4] += e; H[5] += f; H[6] += g; H[7] += h; +} + +void +zio_checksum_SHA256(const 
void *buf, uint64_t size, + zfs_endian_t endian, zio_cksum_t *zcp) +{ + uint32_t H[8] = { 0x6a09e667, 0xbb67ae85, 0x3c6ef372, 0xa54ff53a, + 0x510e527f, 0x9b05688c, 0x1f83d9ab, 0x5be0cd19 }; + uint8_t pad[128]; + unsigned padsize = size & 63; + unsigned i; + + for (i = 0; i < size - padsize; i += 64) + SHA256Transform(H, (uint8_t *)buf + i); + + /* copy the trailing partial block (the last padsize bytes of buf) */ + for (i = 0; i < padsize; i++) + pad[i] = ((uint8_t *)buf)[size - padsize + i]; + + for (pad[padsize++] = 0x80; (padsize & 63) != 56; padsize++) + pad[padsize] = 0; + + for (i = 0; i < 8; i++) + pad[padsize++] = (size << 3) >> (56 - 8 * i); + + for (i = 0; i < padsize; i += 64) + SHA256Transform(H, pad + i); + + zcp->zc_word[0] = cpu_to_zfs64((uint64_t)H[0] << 32 | H[1], + endian); + zcp->zc_word[1] = cpu_to_zfs64((uint64_t)H[2] << 32 | H[3], + endian); + zcp->zc_word[2] = cpu_to_zfs64((uint64_t)H[4] << 32 | H[5], + endian); + zcp->zc_word[3] = cpu_to_zfs64((uint64_t)H[6] << 32 | H[7], + endian); +} diff --git a/include/config_cmd_all.h b/include/config_cmd_all.h index 73c0558..f434cd0 100644 --- a/include/config_cmd_all.h +++ b/include/config_cmd_all.h @@ -87,5 +87,6 @@ #define CONFIG_CMD_UNZIP /* unzip from memory to memory */ #define CONFIG_CMD_USB /* USB Support */ #define CONFIG_CMD_XIMG /* Load part of Multi Image */ +#define CONFIG_CMD_ZFS /* ZFS Support */
#endif /* _CONFIG_CMD_ALL_H */ diff --git a/include/zfs/dmu.h b/include/zfs/dmu.h new file mode 100644 index 0000000..7faa708 --- /dev/null +++ b/include/zfs/dmu.h @@ -0,0 +1,120 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. + */ +/* + * Copyright 2010 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. + */ + +#ifndef _SYS_DMU_H +#define _SYS_DMU_H + +/* + * This file describes the interface that the DMU provides for its + * consumers. + * + * The DMU also interacts with the SPA. That interface is described in + * dmu_spa.h. 
+ */ +typedef enum dmu_object_type { + DMU_OT_NONE, + /* general: */ + DMU_OT_OBJECT_DIRECTORY, /* ZAP */ + DMU_OT_OBJECT_ARRAY, /* UINT64 */ + DMU_OT_PACKED_NVLIST, /* UINT8 (XDR by nvlist_pack/unpack) */ + DMU_OT_PACKED_NVLIST_SIZE, /* UINT64 */ + DMU_OT_BPLIST, /* UINT64 */ + DMU_OT_BPLIST_HDR, /* UINT64 */ + /* spa: */ + DMU_OT_SPACE_MAP_HEADER, /* UINT64 */ + DMU_OT_SPACE_MAP, /* UINT64 */ + /* zil: */ + DMU_OT_INTENT_LOG, /* UINT64 */ + /* dmu: */ + DMU_OT_DNODE, /* DNODE */ + DMU_OT_OBJSET, /* OBJSET */ + /* dsl: */ + DMU_OT_DSL_DIR, /* UINT64 */ + DMU_OT_DSL_DIR_CHILD_MAP, /* ZAP */ + DMU_OT_DSL_DS_SNAP_MAP, /* ZAP */ + DMU_OT_DSL_PROPS, /* ZAP */ + DMU_OT_DSL_DATASET, /* UINT64 */ + /* zpl: */ + DMU_OT_ZNODE, /* ZNODE */ + DMU_OT_OLDACL, /* OLD ACL */ + DMU_OT_PLAIN_FILE_CONTENTS, /* UINT8 */ + DMU_OT_DIRECTORY_CONTENTS, /* ZAP */ + DMU_OT_MASTER_NODE, /* ZAP */ + DMU_OT_UNLINKED_SET, /* ZAP */ + /* zvol: */ + DMU_OT_ZVOL, /* UINT8 */ + DMU_OT_ZVOL_PROP, /* ZAP */ + /* other; for testing only! 
*/ + DMU_OT_PLAIN_OTHER, /* UINT8 */ + DMU_OT_UINT64_OTHER, /* UINT64 */ + DMU_OT_ZAP_OTHER, /* ZAP */ + /* new object types: */ + DMU_OT_ERROR_LOG, /* ZAP */ + DMU_OT_SPA_HISTORY, /* UINT8 */ + DMU_OT_SPA_HISTORY_OFFSETS, /* spa_his_phys_t */ + DMU_OT_POOL_PROPS, /* ZAP */ + DMU_OT_DSL_PERMS, /* ZAP */ + DMU_OT_ACL, /* ACL */ + DMU_OT_SYSACL, /* SYSACL */ + DMU_OT_FUID, /* FUID table (Packed NVLIST UINT8) */ + DMU_OT_FUID_SIZE, /* FUID table size UINT64 */ + DMU_OT_NEXT_CLONES, /* ZAP */ + DMU_OT_SCRUB_QUEUE, /* ZAP */ + DMU_OT_USERGROUP_USED, /* ZAP */ + DMU_OT_USERGROUP_QUOTA, /* ZAP */ + DMU_OT_USERREFS, /* ZAP */ + DMU_OT_DDT_ZAP, /* ZAP */ + DMU_OT_DDT_STATS, /* ZAP */ + DMU_OT_SA, /* System attr */ + DMU_OT_SA_MASTER_NODE, /* ZAP */ + DMU_OT_SA_ATTR_REGISTRATION, /* ZAP */ + DMU_OT_SA_ATTR_LAYOUTS, /* ZAP */ + DMU_OT_NUMTYPES +} dmu_object_type_t; + +typedef enum dmu_objset_type { + DMU_OST_NONE, + DMU_OST_META, + DMU_OST_ZFS, + DMU_OST_ZVOL, + DMU_OST_OTHER, /* For testing only! */ + DMU_OST_ANY, /* Be careful! */ + DMU_OST_NUMTYPES +} dmu_objset_type_t; + +/* + * The names of zap entries in the DIRECTORY_OBJECT of the MOS. + */ +#define DMU_POOL_DIRECTORY_OBJECT 1 +#define DMU_POOL_CONFIG "config" +#define DMU_POOL_ROOT_DATASET "root_dataset" +#define DMU_POOL_SYNC_BPLIST "sync_bplist" +#define DMU_POOL_ERRLOG_SCRUB "errlog_scrub" +#define DMU_POOL_ERRLOG_LAST "errlog_last" +#define DMU_POOL_SPARES "spares" +#define DMU_POOL_DEFLATE "deflate" +#define DMU_POOL_HISTORY "history" +#define DMU_POOL_PROPS "pool_props" +#define DMU_POOL_L2CACHE "l2cache" + +#endif /* _SYS_DMU_H */ diff --git a/include/zfs/dmu_objset.h b/include/zfs/dmu_objset.h new file mode 100644 index 0000000..dbd67ad --- /dev/null +++ b/include/zfs/dmu_objset.h @@ -0,0 +1,43 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. 
+ * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. + */ +/* + * Copyright 2009 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. + */ + +#ifndef _SYS_DMU_OBJSET_H +#define _SYS_DMU_OBJSET_H + +#include <zfs/zil.h> + +#define OBJSET_PHYS_SIZE 2048 +#define OBJSET_PHYS_SIZE_V14 1024 + +typedef struct objset_phys { + dnode_phys_t os_meta_dnode; + zil_header_t os_zil_header; + uint64_t os_type; + uint64_t os_flags; + char os_pad[OBJSET_PHYS_SIZE - sizeof(dnode_phys_t)*3 - + sizeof(zil_header_t) - sizeof(uint64_t)*2]; + dnode_phys_t os_userused_dnode; + dnode_phys_t os_groupused_dnode; +} objset_phys_t; + +#endif /* _SYS_DMU_OBJSET_H */ diff --git a/include/zfs/dnode.h b/include/zfs/dnode.h new file mode 100644 index 0000000..09b4562 --- /dev/null +++ b/include/zfs/dnode.h @@ -0,0 +1,81 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. 
+ * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. + */ +/* + * Copyright 2010 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. + */ + +#ifndef _SYS_DNODE_H +#define _SYS_DNODE_H + +#include <zfs/spa.h> + +/* + * Fixed constants. + */ +#define DNODE_SHIFT 9 /* 512 bytes */ +#define DN_MIN_INDBLKSHIFT 10 /* 1k */ +#define DN_MAX_INDBLKSHIFT 14 /* 16k */ +#define DNODE_BLOCK_SHIFT 14 /* 16k */ +#define DNODE_CORE_SIZE 64 /* 64 bytes for dnode sans blkptrs */ +#define DN_MAX_OBJECT_SHIFT 48 /* 256 trillion (zfs_fid_t limit) */ +#define DN_MAX_OFFSET_SHIFT 64 /* 2^64 bytes in a dnode */ + +/* + * Derived constants. 
+ */ +#define DNODE_SIZE (1 << DNODE_SHIFT) +#define DN_MAX_NBLKPTR ((DNODE_SIZE - DNODE_CORE_SIZE) >> SPA_BLKPTRSHIFT) +#define DN_MAX_BONUSLEN (DNODE_SIZE - DNODE_CORE_SIZE - (1 << SPA_BLKPTRSHIFT)) +#define DN_MAX_OBJECT (1ULL << DN_MAX_OBJECT_SHIFT) + +#define DNODES_PER_BLOCK_SHIFT (DNODE_BLOCK_SHIFT - DNODE_SHIFT) +#define DNODES_PER_BLOCK (1ULL << DNODES_PER_BLOCK_SHIFT) +#define DNODES_PER_LEVEL_SHIFT (DN_MAX_INDBLKSHIFT - SPA_BLKPTRSHIFT) + +#define DNODE_FLAG_SPILL_BLKPTR (1<<2) + +#define DN_BONUS(dnp) ((void *)((dnp)->dn_bonus + \ + (((dnp)->dn_nblkptr - 1) * sizeof(blkptr_t)))) + +typedef struct dnode_phys { + uint8_t dn_type; /* dmu_object_type_t */ + uint8_t dn_indblkshift; /* ln2(indirect block size) */ + uint8_t dn_nlevels; /* 1=dn_blkptr->data blocks */ + uint8_t dn_nblkptr; /* length of dn_blkptr */ + uint8_t dn_bonustype; /* type of data in bonus buffer */ + uint8_t dn_checksum; /* ZIO_CHECKSUM type */ + uint8_t dn_compress; /* ZIO_COMPRESS type */ + uint8_t dn_flags; /* DNODE_FLAG_* */ + uint16_t dn_datablkszsec; /* data block size in 512b sectors */ + uint16_t dn_bonuslen; /* length of dn_bonus */ + uint8_t dn_pad2[4]; + + /* accounting is protected by dn_dirty_mtx */ + uint64_t dn_maxblkid; /* largest allocated block ID */ + uint64_t dn_used; /* bytes (or sectors) of disk space */ + + uint64_t dn_pad3[4]; + + blkptr_t dn_blkptr[1]; + uint8_t dn_bonus[DN_MAX_BONUSLEN - sizeof(blkptr_t)]; + blkptr_t dn_spill; +} dnode_phys_t; + +#endif /* _SYS_DNODE_H */ diff --git a/include/zfs/dsl_dataset.h b/include/zfs/dsl_dataset.h new file mode 100644 index 0000000..ab0ee22 --- /dev/null +++ b/include/zfs/dsl_dataset.h @@ -0,0 +1,53 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. 
+ * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. + */ +/* + * Copyright 2007 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. + */ + +#ifndef _SYS_DSL_DATASET_H +#define _SYS_DSL_DATASET_H + +typedef struct dsl_dataset_phys { + uint64_t ds_dir_obj; + uint64_t ds_prev_snap_obj; + uint64_t ds_prev_snap_txg; + uint64_t ds_next_snap_obj; + uint64_t ds_snapnames_zapobj; /* zap obj of snaps; ==0 for snaps */ + uint64_t ds_num_children; /* clone/snap children; ==0 for head */ + uint64_t ds_creation_time; /* seconds since 1970 */ + uint64_t ds_creation_txg; + uint64_t ds_deadlist_obj; + uint64_t ds_used_bytes; + uint64_t ds_compressed_bytes; + uint64_t ds_uncompressed_bytes; + uint64_t ds_unique_bytes; /* only relevant to snapshots */ + /* + * The ds_fsid_guid is a 56-bit ID that can change to avoid + * collisions. The ds_guid is a 64-bit ID that will never + * change, so there is a small probability that it will collide. 
+ */ + uint64_t ds_fsid_guid; + uint64_t ds_guid; + uint64_t ds_flags; + blkptr_t ds_bp; + uint64_t ds_pad[8]; /* pad out to 320 bytes for good measure */ +} dsl_dataset_phys_t; + +#endif /* _SYS_DSL_DATASET_H */ diff --git a/include/zfs/dsl_dir.h b/include/zfs/dsl_dir.h new file mode 100644 index 0000000..54d6663 --- /dev/null +++ b/include/zfs/dsl_dir.h @@ -0,0 +1,49 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. + */ +/* + * Copyright 2007 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. 
+ */ + +#ifndef _SYS_DSL_DIR_H +#define _SYS_DSL_DIR_H + +typedef struct dsl_dir_phys { + uint64_t dd_creation_time; /* not actually used */ + uint64_t dd_head_dataset_obj; + uint64_t dd_parent_obj; + uint64_t dd_clone_parent_obj; + uint64_t dd_child_dir_zapobj; + /* + * how much space our children are accounting for; for leaf + * datasets, == physical space used by fs + snaps + */ + uint64_t dd_used_bytes; + uint64_t dd_compressed_bytes; + uint64_t dd_uncompressed_bytes; + /* Administrative quota setting */ + uint64_t dd_quota; + /* Administrative reservation setting */ + uint64_t dd_reserved; + uint64_t dd_props_zapobj; + uint64_t dd_deleg_zapobj; /* dataset permissions */ + uint64_t dd_pad[20]; /* pad out to 256 bytes for good measure */ +} dsl_dir_phys_t; + +#endif /* _SYS_DSL_DIR_H */ diff --git a/include/zfs/sa_impl.h b/include/zfs/sa_impl.h new file mode 100644 index 0000000..4d93558 --- /dev/null +++ b/include/zfs/sa_impl.h @@ -0,0 +1,35 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. + */ +/* + * Copyright 2010 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. 
+ */ +#ifndef _SYS_SA_IMPL_H +#define _SYS_SA_IMPL_H + +typedef struct sa_hdr_phys { + uint32_t sa_magic; + uint16_t sa_layout_info; + uint16_t sa_lengths[1]; +} sa_hdr_phys_t; + +#define SA_HDR_SIZE(hdr) BF32_GET_SB(hdr->sa_layout_info, 10, 16, 3, 0) +#define SA_SIZE_OFFSET 0x8 + +#endif /* _SYS_SA_IMPL_H */ diff --git a/include/zfs/spa.h b/include/zfs/spa.h new file mode 100644 index 0000000..360cf89 --- /dev/null +++ b/include/zfs/spa.h @@ -0,0 +1,292 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. + */ +/* + * Copyright (c) 2008, 2011, Oracle and/or its affiliates. All rights reserved. + */ + +#ifndef ZFS_SPA_HEADER +#define ZFS_SPA_HEADER 1 + + +/* + * General-purpose 32-bit and 64-bit bitfield encodings. 
+ */ +#define BF32_DECODE(x, low, len) P2PHASE((x) >> (low), 1U << (len)) +#define BF64_DECODE(x, low, len) P2PHASE((x) >> (low), 1ULL << (len)) +#define BF32_ENCODE(x, low, len) (P2PHASE((x), 1U << (len)) << (low)) +#define BF64_ENCODE(x, low, len) (P2PHASE((x), 1ULL << (len)) << (low)) + +#define BF32_GET(x, low, len) BF32_DECODE(x, low, len) +#define BF64_GET(x, low, len) BF64_DECODE(x, low, len) + +#define BF32_SET(x, low, len, val) \ + ((x) ^= BF32_ENCODE((x >> low) ^ (val), low, len)) +#define BF64_SET(x, low, len, val) \ + ((x) ^= BF64_ENCODE((x >> low) ^ (val), low, len)) + +#define BF32_GET_SB(x, low, len, shift, bias) \ + ((BF32_GET(x, low, len) + (bias)) << (shift)) +#define BF64_GET_SB(x, low, len, shift, bias) \ + ((BF64_GET(x, low, len) + (bias)) << (shift)) + +#define BF32_SET_SB(x, low, len, shift, bias, val) \ + BF32_SET(x, low, len, ((val) >> (shift)) - (bias)) +#define BF64_SET_SB(x, low, len, shift, bias, val) \ + BF64_SET(x, low, len, ((val) >> (shift)) - (bias)) + +/* + * We currently support nine block sizes, from 512 bytes to 128K. + * We could go higher, but the benefits are near-zero and the cost + * of COWing a giant block to modify one byte would become excessive. + */ +#define SPA_MINBLOCKSHIFT 9 +#define SPA_MAXBLOCKSHIFT 17 +#define SPA_MINBLOCKSIZE (1ULL << SPA_MINBLOCKSHIFT) +#define SPA_MAXBLOCKSIZE (1ULL << SPA_MAXBLOCKSHIFT) + +#define SPA_BLOCKSIZES (SPA_MAXBLOCKSHIFT - SPA_MINBLOCKSHIFT + 1) + +/* + * Size of block to hold the configuration data (a packed nvlist) + */ +#define SPA_CONFIG_BLOCKSIZE (1 << 14) + +/* + * The DVA size encodings for LSIZE and PSIZE support blocks up to 32MB. + * The ASIZE encoding should be at least 64 times larger (6 more bits) + * to support up to 4-way RAID-Z mirror mode with worst-case gang block + * overhead, three DVAs per bp, plus one more bit in case we do anything + * else that expands the ASIZE. 
+ */ +#define SPA_LSIZEBITS 16 /* LSIZE up to 32M (2^16 * 512) */ +#define SPA_PSIZEBITS 16 /* PSIZE up to 32M (2^16 * 512) */ +#define SPA_ASIZEBITS 24 /* ASIZE up to 64 times larger */ + +/* + * All SPA data is represented by 128-bit data virtual addresses (DVAs). + * The members of the dva_t should be considered opaque outside the SPA. + */ +typedef struct dva { + uint64_t dva_word[2]; +} dva_t; + +/* + * Each block has a 256-bit checksum -- strong enough for cryptographic hashes. + */ +typedef struct zio_cksum { + uint64_t zc_word[4]; +} zio_cksum_t; + +/* + * Each block is described by its DVAs, time of birth, checksum, etc. + * The word-by-word, bit-by-bit layout of the blkptr is as follows: + * + * 64 56 48 40 32 24 16 8 0 + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * 0 | vdev1 | GRID | ASIZE | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * 1 |G| offset1 | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * 2 | vdev2 | GRID | ASIZE | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * 3 |G| offset2 | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * 4 | vdev3 | GRID | ASIZE | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * 5 |G| offset3 | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * 6 |BDX|lvl| type | cksum | comp | PSIZE | LSIZE | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * 7 | padding | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * 8 | padding | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * 9 | physical birth txg | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * a | logical birth txg | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * b | fill count | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * c | checksum[0] | + * 
+-------+-------+-------+-------+-------+-------+-------+-------+ + * d | checksum[1] | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * e | checksum[2] | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * f | checksum[3] | + * +-------+-------+-------+-------+-------+-------+-------+-------+ + * + * Legend: + * + * vdev virtual device ID + * offset offset into virtual device + * LSIZE logical size + * PSIZE physical size (after compression) + * ASIZE allocated size (including RAID-Z parity and gang block headers) + * GRID RAID-Z layout information (reserved for future use) + * cksum checksum function + * comp compression function + * G gang block indicator + * B byteorder (endianness) + * D dedup + * X unused + * lvl level of indirection + * type DMU object type + * phys birth txg of block allocation; zero if same as logical birth txg + * log. birth transaction group in which the block was logically born + * fill count number of non-zero blocks under this bp + * checksum[4] 256-bit checksum of the data this bp describes + */ +#define SPA_BLKPTRSHIFT 7 /* blkptr_t is 128 bytes */ +#define SPA_DVAS_PER_BP 3 /* Number of DVAs in a bp */ + +typedef struct blkptr { + dva_t blk_dva[SPA_DVAS_PER_BP]; /* Data Virtual Addresses */ + uint64_t blk_prop; /* size, compression, type, etc */ + uint64_t blk_pad[2]; /* Extra space for the future */ + uint64_t blk_phys_birth; /* txg when block was allocated */ + uint64_t blk_birth; /* transaction group at birth */ + uint64_t blk_fill; /* fill count */ + zio_cksum_t blk_cksum; /* 256-bit checksum */ +} blkptr_t; + +/* + * Macros to get and set fields in a bp or DVA. 
+ */ +#define DVA_GET_ASIZE(dva) \ + BF64_GET_SB((dva)->dva_word[0], 0, 24, SPA_MINBLOCKSHIFT, 0) +#define DVA_SET_ASIZE(dva, x) \ + BF64_SET_SB((dva)->dva_word[0], 0, 24, SPA_MINBLOCKSHIFT, 0, x) + +#define DVA_GET_GRID(dva) BF64_GET((dva)->dva_word[0], 24, 8) +#define DVA_SET_GRID(dva, x) BF64_SET((dva)->dva_word[0], 24, 8, x) + +#define DVA_GET_VDEV(dva) BF64_GET((dva)->dva_word[0], 32, 32) +#define DVA_SET_VDEV(dva, x) BF64_SET((dva)->dva_word[0], 32, 32, x) + +#define DVA_GET_GANG(dva) BF64_GET((dva)->dva_word[1], 63, 1) +#define DVA_SET_GANG(dva, x) BF64_SET((dva)->dva_word[1], 63, 1, x) + +#define BP_GET_LSIZE(bp) \ + BF64_GET_SB((bp)->blk_prop, 0, 16, SPA_MINBLOCKSHIFT, 1) +#define BP_SET_LSIZE(bp, x) \ + BF64_SET_SB((bp)->blk_prop, 0, 16, SPA_MINBLOCKSHIFT, 1, x) + +#define BP_GET_COMPRESS(bp) BF64_GET((bp)->blk_prop, 32, 8) +#define BP_SET_COMPRESS(bp, x) BF64_SET((bp)->blk_prop, 32, 8, x) + +#define BP_GET_CHECKSUM(bp) BF64_GET((bp)->blk_prop, 40, 8) +#define BP_SET_CHECKSUM(bp, x) BF64_SET((bp)->blk_prop, 40, 8, x) + +#define BP_GET_TYPE(bp) BF64_GET((bp)->blk_prop, 48, 8) +#define BP_SET_TYPE(bp, x) BF64_SET((bp)->blk_prop, 48, 8, x) + +#define BP_GET_LEVEL(bp) BF64_GET((bp)->blk_prop, 56, 5) +#define BP_SET_LEVEL(bp, x) BF64_SET((bp)->blk_prop, 56, 5, x) + +#define BP_GET_PROP_BIT_61(bp) BF64_GET((bp)->blk_prop, 61, 1) +#define BP_SET_PROP_BIT_61(bp, x) BF64_SET((bp)->blk_prop, 61, 1, x) + +#define BP_GET_DEDUP(bp) BF64_GET((bp)->blk_prop, 62, 1) +#define BP_SET_DEDUP(bp, x) BF64_SET((bp)->blk_prop, 62, 1, x) + +#define BP_GET_BYTEORDER(bp) (0 - BF64_GET((bp)->blk_prop, 63, 1)) +#define BP_SET_BYTEORDER(bp, x) BF64_SET((bp)->blk_prop, 63, 1, x) + +#define BP_PHYSICAL_BIRTH(bp) \ + ((bp)->blk_phys_birth ? (bp)->blk_phys_birth : (bp)->blk_birth) + +#define BP_SET_BIRTH(bp, logical, physical) \ + { \ + (bp)->blk_birth = (logical); \ + (bp)->blk_phys_birth = ((logical) == (physical) ? 
0 : (physical)); \ + } + +#define BP_GET_ASIZE(bp) \ + (DVA_GET_ASIZE(&(bp)->blk_dva[0]) + DVA_GET_ASIZE(&(bp)->blk_dva[1]) + \ + DVA_GET_ASIZE(&(bp)->blk_dva[2])) + +#define BP_GET_UCSIZE(bp) \ + ((BP_GET_LEVEL(bp) > 0 || dmu_ot[BP_GET_TYPE(bp)].ot_metadata) ? \ + BP_GET_PSIZE(bp) : BP_GET_LSIZE(bp)) + +#define BP_GET_NDVAS(bp) \ + (!!DVA_GET_ASIZE(&(bp)->blk_dva[0]) + \ + !!DVA_GET_ASIZE(&(bp)->blk_dva[1]) + \ + !!DVA_GET_ASIZE(&(bp)->blk_dva[2])) + +#define BP_COUNT_GANG(bp) \ + (DVA_GET_GANG(&(bp)->blk_dva[0]) + \ + DVA_GET_GANG(&(bp)->blk_dva[1]) + \ + DVA_GET_GANG(&(bp)->blk_dva[2])) + +#define DVA_EQUAL(dva1, dva2) \ + ((dva1)->dva_word[1] == (dva2)->dva_word[1] && \ + (dva1)->dva_word[0] == (dva2)->dva_word[0]) + +#define BP_EQUAL(bp1, bp2) \ + (BP_PHYSICAL_BIRTH(bp1) == BP_PHYSICAL_BIRTH(bp2) && \ + DVA_EQUAL(&(bp1)->blk_dva[0], &(bp2)->blk_dva[0]) && \ + DVA_EQUAL(&(bp1)->blk_dva[1], &(bp2)->blk_dva[1]) && \ + DVA_EQUAL(&(bp1)->blk_dva[2], &(bp2)->blk_dva[2])) + +#define ZIO_CHECKSUM_EQUAL(zc1, zc2) \ + (0 == (((zc1).zc_word[0] - (zc2).zc_word[0]) | \ + ((zc1).zc_word[1] - (zc2).zc_word[1]) | \ + ((zc1).zc_word[2] - (zc2).zc_word[2]) | \ + ((zc1).zc_word[3] - (zc2).zc_word[3]))) + +#define DVA_IS_VALID(dva) (DVA_GET_ASIZE(dva) != 0) + +#define ZIO_SET_CHECKSUM(zcp, w0, w1, w2, w3) \ + { \ + (zcp)->zc_word[0] = w0; \ + (zcp)->zc_word[1] = w1; \ + (zcp)->zc_word[2] = w2; \ + (zcp)->zc_word[3] = w3; \ + } + +#define BP_IDENTITY(bp) (&(bp)->blk_dva[0]) +#define BP_IS_GANG(bp) DVA_GET_GANG(BP_IDENTITY(bp)) +#define BP_IS_HOLE(bp) ((bp)->blk_birth == 0) + +/* BP_IS_RAIDZ(bp) assumes no block compression */ +#define BP_IS_RAIDZ(bp) (DVA_GET_ASIZE(&(bp)->blk_dva[0]) > \ + BP_GET_PSIZE(bp)) + +#define BP_ZERO(bp) \ + { \ + (bp)->blk_dva[0].dva_word[0] = 0; \ + (bp)->blk_dva[0].dva_word[1] = 0; \ + (bp)->blk_dva[1].dva_word[0] = 0; \ + (bp)->blk_dva[1].dva_word[1] = 0; \ + (bp)->blk_dva[2].dva_word[0] = 0; \ + (bp)->blk_dva[2].dva_word[1] = 0; \ + (bp)->blk_prop = 
0; \ + (bp)->blk_pad[0] = 0; \ + (bp)->blk_pad[1] = 0; \ + (bp)->blk_phys_birth = 0; \ + (bp)->blk_birth = 0; \ + (bp)->blk_fill = 0; \ + ZIO_SET_CHECKSUM(&(bp)->blk_cksum, 0, 0, 0, 0); \ + } + +#define BP_SPRINTF_LEN 320 + +#endif /* ! ZFS_SPA_HEADER */ diff --git a/include/zfs/uberblock_impl.h b/include/zfs/uberblock_impl.h new file mode 100644 index 0000000..806da95 --- /dev/null +++ b/include/zfs/uberblock_impl.h @@ -0,0 +1,57 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. + */ +/* + * Copyright 2007 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. + */ + +#ifndef _SYS_UBERBLOCK_IMPL_H +#define _SYS_UBERBLOCK_IMPL_H + +#define UBMAX(a, b) ((a) > (b) ? (a) : (b)) + +/* + * The uberblock version is incremented whenever an incompatible on-disk + * format change is made to the SPA, DMU, or ZAP. + * + * Note: the first two fields should never be moved. When a storage pool + * is opened, the uberblock must be read off the disk before the version + * can be checked. If the ub_version field is moved, we may not detect + * version mismatch. If the ub_magic field is moved, applications that + * expect the magic number in the first word won't work. 
+ */ +#define UBERBLOCK_MAGIC 0x00bab10c /* oo-ba-bloc! */ +#define UBERBLOCK_SHIFT 10 /* up to 1K */ + +typedef struct uberblock { + uint64_t ub_magic; /* UBERBLOCK_MAGIC */ + uint64_t ub_version; /* ZFS_VERSION */ + uint64_t ub_txg; /* txg of last sync */ + uint64_t ub_guid_sum; /* sum of all vdev guids */ + uint64_t ub_timestamp; /* UTC time of last sync */ + blkptr_t ub_rootbp; /* MOS objset_phys_t */ +} uberblock_t; + +#define VDEV_UBERBLOCK_SHIFT(as) UBMAX(as, UBERBLOCK_SHIFT) +#define UBERBLOCK_SIZE(as) (1ULL << VDEV_UBERBLOCK_SHIFT(as)) + +/* Number of uberblocks that can fit in the ring at a given ashift */ +#define UBERBLOCK_COUNT(as) (VDEV_UBERBLOCK_RING >> VDEV_UBERBLOCK_SHIFT(as)) + +#endif /* _SYS_UBERBLOCK_IMPL_H */ diff --git a/include/zfs/vdev_impl.h b/include/zfs/vdev_impl.h new file mode 100644 index 0000000..3aa646e --- /dev/null +++ b/include/zfs/vdev_impl.h @@ -0,0 +1,70 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. + */ +/* + * Copyright 2010 Sun Microsystems, Inc. All rights reserved. + * Use is subject to license terms. 
+ */ + +#ifndef _SYS_VDEV_IMPL_H +#define _SYS_VDEV_IMPL_H + +#define VDEV_SKIP_SIZE (8 << 10) +#define VDEV_BOOT_HEADER_SIZE (8 << 10) +#define VDEV_PHYS_SIZE (112 << 10) +#define VDEV_UBERBLOCK_RING (128 << 10) + +/* ZFS boot block */ +#define VDEV_BOOT_MAGIC 0x2f5b007b10cULL +#define VDEV_BOOT_VERSION 1 /* version number */ + +typedef struct vdev_boot_header { + uint64_t vb_magic; /* VDEV_BOOT_MAGIC */ + uint64_t vb_version; /* VDEV_BOOT_VERSION */ + uint64_t vb_offset; /* start offset (bytes) */ + uint64_t vb_size; /* size (bytes) */ + char vb_pad[VDEV_BOOT_HEADER_SIZE - 4 * sizeof(uint64_t)]; +} vdev_boot_header_t; + +typedef struct vdev_phys { + char vp_nvlist[VDEV_PHYS_SIZE - sizeof(zio_eck_t)]; + zio_eck_t vp_zbt; +} vdev_phys_t; + +typedef struct vdev_label { + char vl_pad[VDEV_SKIP_SIZE]; /* 8K */ + vdev_boot_header_t vl_boot_header; /* 8K */ + vdev_phys_t vl_vdev_phys; /* 112K */ + char vl_uberblock[VDEV_UBERBLOCK_RING]; /* 128K */ +} vdev_label_t; /* 256K total */ + +/* + * Size and offset of embedded boot loader region on each label. + * The total size of the first two labels plus the boot area is 4MB. + */ +#define VDEV_BOOT_OFFSET (2 * sizeof(vdev_label_t)) +#define VDEV_BOOT_SIZE (7ULL << 19) /* 3.5M */ + +/* + * Size of label regions at the start and end of each leaf device. + */ +#define VDEV_LABEL_START_SIZE (2 * sizeof(vdev_label_t) + VDEV_BOOT_SIZE) +#define VDEV_LABEL_END_SIZE (2 * sizeof(vdev_label_t)) +#define VDEV_LABELS 4 + +#endif /* _SYS_VDEV_IMPL_H */ diff --git a/include/zfs/zap_impl.h b/include/zfs/zap_impl.h new file mode 100644 index 0000000..5d24ab0 --- /dev/null +++ b/include/zfs/zap_impl.h @@ -0,0 +1,111 @@ +/* + * GRUB -- GRand Unified Bootloader + * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc. 
+ * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. + */ +/* + * Copyright (c) 2008, 2011, Oracle and/or its affiliates. All rights reserved. + */ + +#ifndef _SYS_ZAP_IMPL_H +#define _SYS_ZAP_IMPL_H + +#define ZAP_MAGIC 0x2F52AB2ABULL + +#define ZAP_HASHBITS 28 +#define MZAP_ENT_LEN 64 +#define MZAP_NAME_LEN (MZAP_ENT_LEN - 8 - 4 - 2) +#define MZAP_MAX_BLKSHIFT SPA_MAXBLOCKSHIFT +#define MZAP_MAX_BLKSZ (1 << MZAP_MAX_BLKSHIFT) + +typedef struct mzap_ent_phys { + uint64_t mze_value; + uint32_t mze_cd; + uint16_t mze_pad; /* in case we want to chain them someday */ + char mze_name[MZAP_NAME_LEN]; +} mzap_ent_phys_t; + +typedef struct mzap_phys { + uint64_t mz_block_type; /* ZBT_MICRO */ + uint64_t mz_salt; + uint64_t mz_pad[6]; + mzap_ent_phys_t mz_chunk[1]; + /* actually variable size depending on block size */ +} mzap_phys_t; + +/* + * The (fat) zap is stored in one object. It is an array of + * 1<<FZAP_BLOCK_SHIFT byte blocks. The layout looks like one of: + * + * ptrtbl fits in first block: + * [zap_phys_t zap_ptrtbl_shift < 6] [zap_leaf_t] ... + * + * ptrtbl too big for first block: + * [zap_phys_t zap_ptrtbl_shift >= 6] [zap_leaf_t] [ptrtbl] ... 
+ * + */ + +#define ZBT_LEAF ((1ULL << 63) + 0) +#define ZBT_HEADER ((1ULL << 63) + 1) +#define ZBT_MICRO ((1ULL << 63) + 3) +/* any other values are ptrtbl blocks */ + +/* + * the embedded pointer table takes up half a block: + * block size / entry size (2^3) / 2 + */ +#define ZAP_EMBEDDED_PTRTBL_SHIFT(zap) (FZAP_BLOCK_SHIFT(zap) - 3 - 1) + +/* + * The embedded pointer table starts half-way through the block. Since + * the pointer table itself is half the block, it starts at (64-bit) + * word number (1<<ZAP_EMBEDDED_PTRTBL_SHIFT(zap)). + */ +#define ZAP_EMBEDDED_PTRTBL_ENT(zap, idx) \ + ((uint64_t *)(zap)->zap_f.zap_phys) \ + [(idx) + (1<<ZAP_EMBEDDED_PTRTBL_SHIFT(zap))] + +/* + * TAKE NOTE: + * If zap_phys_t is modified, zap_byteswap() must be modified. + */ +typedef struct zap_phys { + uint64_t zap_block_type; /* ZBT_HEADER */ + uint64_t zap_magic; /* ZAP_MAGIC */ + + struct zap_table_phys { + uint64_t zt_blk; /* starting block number */ + uint64_t zt_numblks; /* number of blocks */ + uint64_t zt_shift; /* bits to index it */ + uint64_t zt_nextblk; /* next (larger) copy start block */ + uint64_t zt_blks_copied; /* number source blocks copied */ + } zap_ptrtbl; + + uint64_t zap_freeblk; /* the next free block */ + uint64_t zap_num_leafs; /* number of leafs */ + uint64_t zap_num_entries; /* number of entries */ + uint64_t zap_salt; /* salt to stir into hash function */ + uint64_t zap_normflags; /* flags for u8_textprep_str() */ + uint64_t zap_flags; /* zap_flag_t */ + /* + * This structure is followed by padding, and then the embedded + * pointer table. The embedded pointer table takes up second + * half of the block. It is accessed using the + * ZAP_EMBEDDED_PTRTBL_ENT() macro. 
+	 */
+} zap_phys_t;
+
+#endif /* _SYS_ZAP_IMPL_H */
diff --git a/include/zfs/zap_leaf.h b/include/zfs/zap_leaf.h
new file mode 100644
index 0000000..ed7ff62
--- /dev/null
+++ b/include/zfs/zap_leaf.h
@@ -0,0 +1,103 @@
+/*
+ * GRUB -- GRand Unified Bootloader
+ * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+ */
+/*
+ * Copyright 2007 Sun Microsystems, Inc. All rights reserved.
+ * Use is subject to license terms.
+ */
+
+#ifndef _SYS_ZAP_LEAF_H
+#define _SYS_ZAP_LEAF_H
+
+#define ZAP_LEAF_MAGIC 0x2AB1EAF
+
+/* chunk size = 24 bytes */
+#define ZAP_LEAF_CHUNKSIZE 24
+
+/*
+ * The amount of space within the chunk available for the array is:
+ * chunk size - space for type (1) - space for next pointer (2)
+ */
+#define ZAP_LEAF_ARRAY_BYTES (ZAP_LEAF_CHUNKSIZE - 3)
+
+typedef enum zap_chunk_type {
+	ZAP_CHUNK_FREE = 253,
+	ZAP_CHUNK_ENTRY = 252,
+	ZAP_CHUNK_ARRAY = 251,
+	ZAP_CHUNK_TYPE_MAX = 250
+} zap_chunk_type_t;
+
+/*
+ * TAKE NOTE:
+ * If zap_leaf_phys_t is modified, zap_leaf_byteswap() must be modified.
+ */
+typedef struct zap_leaf_phys {
+	struct zap_leaf_header {
+		uint64_t lh_block_type; /* ZBT_LEAF */
+		uint64_t lh_pad1;
+		uint64_t lh_prefix; /* hash prefix of this leaf */
+		uint32_t lh_magic; /* ZAP_LEAF_MAGIC */
+		uint16_t lh_nfree; /* number free chunks */
+		uint16_t lh_nentries; /* number of entries */
+		uint16_t lh_prefix_len; /* num bits used to id this */
+
+		/* above is accessable to zap, below is zap_leaf private */
+
+		uint16_t lh_freelist; /* chunk head of free list */
+		uint8_t lh_pad2[12];
+	} l_hdr; /* 2 24-byte chunks */
+
+	/*
+	 * The header is followed by a hash table with
+	 * ZAP_LEAF_HASH_NUMENTRIES(zap) entries. The hash table is
+	 * followed by an array of ZAP_LEAF_NUMCHUNKS(zap)
+	 * zap_leaf_chunk structures. These structures are accessed
+	 * with the ZAP_LEAF_CHUNK() macro.
+	 */
+
+	uint16_t l_hash[1];
+} zap_leaf_phys_t;
+
+typedef union zap_leaf_chunk {
+	struct zap_leaf_entry {
+		uint8_t le_type; /* always ZAP_CHUNK_ENTRY */
+		uint8_t le_int_size; /* size of ints */
+		uint16_t le_next; /* next entry in hash chain */
+		uint16_t le_name_chunk; /* first chunk of the name */
+		uint16_t le_name_length; /* bytes in name, incl null */
+		uint16_t le_value_chunk; /* first chunk of the value */
+		uint16_t le_value_length; /* value length in ints */
+		uint32_t le_cd; /* collision differentiator */
+		uint64_t le_hash; /* hash value of the name */
+	} l_entry;
+	struct zap_leaf_array {
+		uint8_t la_type; /* always ZAP_CHUNK_ARRAY */
+		union {
+			uint8_t la_array[ZAP_LEAF_ARRAY_BYTES];
+			uint64_t la_array64;
+		} __attribute__ ((packed));
+		uint16_t la_next; /* next blk or CHAIN_END */
+	} l_array;
+	struct zap_leaf_free {
+		uint8_t lf_type; /* always ZAP_CHUNK_FREE */
+		uint8_t lf_pad[ZAP_LEAF_ARRAY_BYTES];
+		uint16_t lf_next; /* next in free list, or CHAIN_END */
+	} l_free;
+} zap_leaf_chunk_t;
+
+#endif /* _SYS_ZAP_LEAF_H */
diff --git a/include/zfs/zfs.h b/include/zfs/zfs.h
new file mode 100644
index 0000000..319ee6a
--- /dev/null
+++ b/include/zfs/zfs.h
@@ -0,0 +1,122 @@
+/*
+ * GRUB -- GRand Unified Bootloader
+ * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+ */
+/*
+ * Copyright (c) 2007, 2011, Oracle and/or its affiliates. All rights reserved.
+ */
+
+#ifndef GRUB_ZFS_HEADER
+#define GRUB_ZFS_HEADER 1
+
+
+/*
+ * On-disk version number.
+ */
+#define SPA_VERSION 28ULL
+
+/*
+ * The following are configuration names used in the nvlist describing a pool's
+ * configuration.
+ */
+#define ZPOOL_CONFIG_VERSION "version"
+#define ZPOOL_CONFIG_POOL_NAME "name"
+#define ZPOOL_CONFIG_POOL_STATE "state"
+#define ZPOOL_CONFIG_POOL_TXG "txg"
+#define ZPOOL_CONFIG_POOL_GUID "pool_guid"
+#define ZPOOL_CONFIG_CREATE_TXG "create_txg"
+#define ZPOOL_CONFIG_TOP_GUID "top_guid"
+#define ZPOOL_CONFIG_VDEV_TREE "vdev_tree"
+#define ZPOOL_CONFIG_TYPE "type"
+#define ZPOOL_CONFIG_CHILDREN "children"
+#define ZPOOL_CONFIG_ID "id"
+#define ZPOOL_CONFIG_GUID "guid"
+#define ZPOOL_CONFIG_PATH "path"
+#define ZPOOL_CONFIG_DEVID "devid"
+#define ZPOOL_CONFIG_METASLAB_ARRAY "metaslab_array"
+#define ZPOOL_CONFIG_METASLAB_SHIFT "metaslab_shift"
+#define ZPOOL_CONFIG_ASHIFT "ashift"
+#define ZPOOL_CONFIG_ASIZE "asize"
+#define ZPOOL_CONFIG_DTL "DTL"
+#define ZPOOL_CONFIG_STATS "stats"
+#define ZPOOL_CONFIG_WHOLE_DISK "whole_disk"
+#define ZPOOL_CONFIG_ERRCOUNT "error_count"
+#define ZPOOL_CONFIG_NOT_PRESENT "not_present"
+#define ZPOOL_CONFIG_SPARES "spares"
+#define ZPOOL_CONFIG_IS_SPARE "is_spare"
+#define ZPOOL_CONFIG_NPARITY "nparity"
+#define ZPOOL_CONFIG_PHYS_PATH "phys_path"
+#define ZPOOL_CONFIG_L2CACHE "l2cache"
+#define ZPOOL_CONFIG_HOLE_ARRAY "hole_array"
+#define ZPOOL_CONFIG_VDEV_CHILDREN "vdev_children"
+#define ZPOOL_CONFIG_IS_HOLE "is_hole"
+#define ZPOOL_CONFIG_DDT_HISTOGRAM "ddt_histogram"
+#define ZPOOL_CONFIG_DDT_OBJ_STATS "ddt_object_stats"
+#define ZPOOL_CONFIG_DDT_STATS "ddt_stats"
+/*
+ * The persistent vdev state is stored as separate values rather than a single
+ * 'vdev_state' entry. This is because a device can be in multiple states, such
+ * as offline and degraded.
+ */
+#define ZPOOL_CONFIG_OFFLINE "offline"
+#define ZPOOL_CONFIG_FAULTED "faulted"
+#define ZPOOL_CONFIG_DEGRADED "degraded"
+#define ZPOOL_CONFIG_REMOVED "removed"
+
+#define VDEV_TYPE_ROOT "root"
+#define VDEV_TYPE_MIRROR "mirror"
+#define VDEV_TYPE_REPLACING "replacing"
+#define VDEV_TYPE_RAIDZ "raidz"
+#define VDEV_TYPE_DISK "disk"
+#define VDEV_TYPE_FILE "file"
+#define VDEV_TYPE_MISSING "missing"
+#define VDEV_TYPE_HOLE "hole"
+#define VDEV_TYPE_SPARE "spare"
+#define VDEV_TYPE_L2CACHE "l2cache"
+
+/*
+ * pool state. The following states are written to disk as part of the normal
+ * SPA lifecycle: ACTIVE, EXPORTED, DESTROYED, SPARE, L2CACHE. The remaining
+ * states are software abstractions used at various levels to communicate pool
+ * state.
+ */
+typedef enum pool_state {
+	POOL_STATE_ACTIVE = 0, /* In active use */
+	POOL_STATE_EXPORTED, /* Explicitly exported */
+	POOL_STATE_DESTROYED, /* Explicitly destroyed */
+	POOL_STATE_SPARE, /* Reserved for hot spare use */
+	POOL_STATE_L2CACHE, /* Level 2 ARC device */
+	POOL_STATE_UNINITIALIZED, /* Internal spa_t state */
+	POOL_STATE_UNAVAIL, /* Internal libzfs state */
+	POOL_STATE_POTENTIALLY_ACTIVE /* Internal libzfs state */
+} pool_state_t;
+
+struct zfs_data;
+
+int zfs_fetch_nvlist(device_t dev, char **nvlist);
+int zfs_getmdnobj(device_t dev, const char *fsfilename,
+	uint64_t *mdnobj);
+
+char *zfs_nvlist_lookup_string(char *nvlist, char *name);
+char *zfs_nvlist_lookup_nvlist(char *nvlist, char *name);
+int zfs_nvlist_lookup_uint64(char *nvlist, char *name,
+	uint64_t *out);
+char *zfs_nvlist_lookup_nvlist_array(char *nvlist, char *name,
+	size_t index);
+int zfs_nvlist_lookup_nvlist_array_get_nelm(char *nvlist, char *name);
+
+#endif /* ! GRUB_ZFS_HEADER */
diff --git a/include/zfs/zfs_acl.h b/include/zfs/zfs_acl.h
new file mode 100644
index 0000000..1b3e14f
--- /dev/null
+++ b/include/zfs/zfs_acl.h
@@ -0,0 +1,55 @@
+/*
+ * GRUB -- GRand Unified Bootloader
+ * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+ */
+/*
+ * Copyright 2007 Sun Microsystems, Inc. All rights reserved.
+ * Use is subject to license terms.
+ */
+
+#ifndef _SYS_FS_ZFS_ACL_H
+#define _SYS_FS_ZFS_ACL_H
+
+typedef struct zfs_oldace {
+	uint32_t z_fuid; /* "who" */
+	uint32_t z_access_mask; /* access mask */
+	uint16_t z_flags; /* flags, i.e inheritance */
+	uint16_t z_type; /* type of entry allow/deny */
+} zfs_oldace_t;
+
+#define ACE_SLOT_CNT 6
+
+typedef struct zfs_znode_acl_v0 {
+	uint64_t z_acl_extern_obj; /* ext acl pieces */
+	uint32_t z_acl_count; /* Number of ACEs */
+	uint16_t z_acl_version; /* acl version */
+	uint16_t z_acl_pad; /* pad */
+	zfs_oldace_t z_ace_data[ACE_SLOT_CNT]; /* 6 standard ACEs */
+} zfs_znode_acl_v0_t;
+
+#define ZFS_ACE_SPACE (sizeof(zfs_oldace_t) * ACE_SLOT_CNT)
+
+typedef struct zfs_znode_acl {
+	uint64_t z_acl_extern_obj; /* ext acl pieces */
+	uint32_t z_acl_size; /* Number of bytes in ACL */
+	uint16_t z_acl_version; /* acl version */
+	uint16_t z_acl_count; /* ace count */
+	uint8_t z_ace_data[ZFS_ACE_SPACE]; /* space for embedded ACEs */
+} zfs_znode_acl_t;
+
+
+#endif /* _SYS_FS_ZFS_ACL_H */
diff --git a/include/zfs/zfs_znode.h b/include/zfs/zfs_znode.h
new file mode 100644
index 0000000..2267aca
--- /dev/null
+++ b/include/zfs/zfs_znode.h
@@ -0,0 +1,71 @@
+/*
+ * GRUB -- GRand Unified Bootloader
+ * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+ */
+/*
+ * Copyright 2010 Sun Microsystems, Inc. All rights reserved.
+ * Use is subject to license terms.
+ */
+
+#ifndef _SYS_FS_ZFS_ZNODE_H
+#define _SYS_FS_ZFS_ZNODE_H
+
+#include <zfs/zfs_acl.h>
+
+#define MASTER_NODE_OBJ 1
+#define ZFS_ROOT_OBJ "ROOT"
+#define ZPL_VERSION_STR "VERSION"
+#define ZFS_SA_ATTRS "SA_ATTRS"
+
+#define ZPL_VERSION 5ULL
+
+#define ZFS_DIRENT_OBJ(de) BF64_GET(de, 0, 48)
+
+/*
+ * This is the persistent portion of the znode. It is stored
+ * in the "bonus buffer" of the file. Short symbolic links
+ * are also stored in the bonus buffer.
+ */
+typedef struct znode_phys {
+	uint64_t zp_atime[2]; /* 0 - last file access time */
+	uint64_t zp_mtime[2]; /* 16 - last file modification time */
+	uint64_t zp_ctime[2]; /* 32 - last file change time */
+	uint64_t zp_crtime[2]; /* 48 - creation time */
+	uint64_t zp_gen; /* 64 - generation (txg of creation) */
+	uint64_t zp_mode; /* 72 - file mode bits */
+	uint64_t zp_size; /* 80 - size of file */
+	uint64_t zp_parent; /* 88 - directory parent (`..') */
+	uint64_t zp_links; /* 96 - number of links to file */
+	uint64_t zp_xattr; /* 104 - DMU object for xattrs */
+	uint64_t zp_rdev; /* 112 - dev_t for VBLK & VCHR files */
+	uint64_t zp_flags; /* 120 - persistent flags */
+	uint64_t zp_uid; /* 128 - file owner */
+	uint64_t zp_gid; /* 136 - owning group */
+	uint64_t zp_pad[4]; /* 144 - future */
+	zfs_znode_acl_t zp_acl; /* 176 - 263 ACL */
+	/*
+	 * Data may pad out any remaining bytes in the znode buffer, eg:
+	 *
+	 * |<---------------------- dnode_phys (512) ------------------------>|
+	 * |<-- dnode (192) --->|<----------- "bonus" buffer (320) ---------->|
+	 * |<---- znode (264) ---->|<---- data (56) ---->|
+	 *
+	 * At present, we only use this space to store symbolic links.
+	 */
+} znode_phys_t;
+
+#endif /* _SYS_FS_ZFS_ZNODE_H */
diff --git a/include/zfs/zil.h b/include/zfs/zil.h
new file mode 100644
index 0000000..036573a
--- /dev/null
+++ b/include/zfs/zil.h
@@ -0,0 +1,57 @@
+/*
+ * GRUB -- GRand Unified Bootloader
+ * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+ */
+/*
+ * Copyright 2009 Sun Microsystems, Inc. All rights reserved.
+ * Use is subject to license terms.
+ */
+
+#ifndef _SYS_ZIL_H
+#define _SYS_ZIL_H
+
+/*
+ * Intent log format:
+ *
+ * Each objset has its own intent log. The log header (zil_header_t)
+ * for objset N's intent log is kept in the Nth object of the SPA's
+ * intent_log objset. The log header points to a chain of log blocks,
+ * each of which contains log records (i.e., transactions) followed by
+ * a log block trailer (zil_trailer_t). The format of a log record
+ * depends on the record (or transaction) type, but all records begin
+ * with a common structure that defines the type, length, and txg.
+ */
+
+/*
+ * Intent log header - this on disk structure holds fields to manage
+ * the log. All fields are 64 bit to easily handle cross architectures.
+ */
+typedef struct zil_header {
+	uint64_t zh_claim_txg; /* txg in which log blocks were claimed */
+	uint64_t zh_replay_seq; /* highest replayed sequence number */
+	blkptr_t zh_log; /* log chain */
+	uint64_t zh_claim_seq; /* highest claimed sequence number */
+	uint64_t zh_flags; /* header flags */
+	uint64_t zh_pad[4];
+} zil_header_t;
+
+/*
+ * zh_flags bit settings
+ */
+#define ZIL_REPLAY_NEEDED 0x1 /* replay needed - internal only */
+
+#endif /* _SYS_ZIL_H */
diff --git a/include/zfs/zio.h b/include/zfs/zio.h
new file mode 100644
index 0000000..4adb14c
--- /dev/null
+++ b/include/zfs/zio.h
@@ -0,0 +1,92 @@
+/*
+ * GRUB -- GRand Unified Bootloader
+ * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+ */
+/*
+ * Copyright (c) 2008, 2010, Oracle and/or its affiliates. All rights reserved.
+ */
+
+#ifndef _ZIO_H
+#define _ZIO_H
+
+#include <zfs/spa.h>
+
+#define ZEC_MAGIC 0x210da7ab10c7a11ULL /* zio data bloc tail */
+
+typedef struct zio_eck {
+	uint64_t zec_magic; /* for validation, endianness */
+	zio_cksum_t zec_cksum; /* 256-bit checksum */
+} zio_eck_t;
+
+/*
+ * Gang block headers are self-checksumming and contain an array
+ * of block pointers.
+ */
+#define SPA_GANGBLOCKSIZE SPA_MINBLOCKSIZE
+#define SPA_GBH_NBLKPTRS ((SPA_GANGBLOCKSIZE - \
+	sizeof(zio_eck_t)) / sizeof(blkptr_t))
+#define SPA_GBH_FILLER ((SPA_GANGBLOCKSIZE - \
+	sizeof(zio_eck_t) - \
+	(SPA_GBH_NBLKPTRS * sizeof(blkptr_t))) /\
+	sizeof(uint64_t))
+
+#define ZIO_GET_IOSIZE(zio) \
+	(BP_IS_GANG((zio)->io_bp) ? \
+	SPA_GANGBLOCKSIZE : BP_GET_PSIZE((zio)->io_bp))
+
+typedef struct zio_gbh {
+	blkptr_t zg_blkptr[SPA_GBH_NBLKPTRS];
+	uint64_t zg_filler[SPA_GBH_FILLER];
+	zio_eck_t zg_tail;
+} zio_gbh_phys_t;
+
+enum zio_checksum {
+	ZIO_CHECKSUM_INHERIT = 0,
+	ZIO_CHECKSUM_ON,
+	ZIO_CHECKSUM_OFF,
+	ZIO_CHECKSUM_LABEL,
+	ZIO_CHECKSUM_GANG_HEADER,
+	ZIO_CHECKSUM_ZILOG,
+	ZIO_CHECKSUM_FLETCHER_2,
+	ZIO_CHECKSUM_FLETCHER_4,
+	ZIO_CHECKSUM_SHA256,
+	ZIO_CHECKSUM_ZILOG2,
+	ZIO_CHECKSUM_FUNCTIONS
+};
+
+#define ZIO_CHECKSUM_ON_VALUE ZIO_CHECKSUM_FLETCHER_2
+#define ZIO_CHECKSUM_DEFAULT ZIO_CHECKSUM_ON
+
+enum zio_compress {
+	ZIO_COMPRESS_INHERIT = 0,
+	ZIO_COMPRESS_ON,
+	ZIO_COMPRESS_OFF,
+	ZIO_COMPRESS_LZJB,
+	ZIO_COMPRESS_EMPTY,
+	ZIO_COMPRESS_GZIP1,
+	ZIO_COMPRESS_GZIP2,
+	ZIO_COMPRESS_GZIP3,
+	ZIO_COMPRESS_GZIP4,
+	ZIO_COMPRESS_GZIP5,
+	ZIO_COMPRESS_GZIP6,
+	ZIO_COMPRESS_GZIP7,
+	ZIO_COMPRESS_GZIP8,
+	ZIO_COMPRESS_GZIP9,
+	ZIO_COMPRESS_FUNCTIONS
+};
+
+#endif /* _ZIO_H */
diff --git a/include/zfs/zio_checksum.h b/include/zfs/zio_checksum.h
new file mode 100644
index 0000000..0d5fce6
--- /dev/null
+++ b/include/zfs/zio_checksum.h
@@ -0,0 +1,50 @@
+/*
+ * GRUB -- GRand Unified Bootloader
+ * Copyright (C) 1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+ */
+/*
+ * Copyright 2010 Sun Microsystems, Inc. All rights reserved.
+ * Use is subject to license terms.
+ */
+
+#ifndef _SYS_ZIO_CHECKSUM_H
+#define _SYS_ZIO_CHECKSUM_H
+
+/*
+ * Signature for checksum functions.
+ */
+typedef void zio_checksum_t(const void *data, uint64_t size,
+	zfs_endian_t endian, zio_cksum_t *zcp);
+
+/*
+ * Information about each checksum function.
+ */
+typedef struct zio_checksum_info {
+	zio_checksum_t *ci_func; /* checksum function for each byteorder */
+	int ci_correctable; /* number of correctable bits */
+	int ci_eck; /* uses zio embedded checksum? */
+	char *ci_name; /* descriptive name */
+} zio_checksum_info_t;
+
+extern void zio_checksum_SHA256(const void *, uint64_t,
+	zfs_endian_t endian, zio_cksum_t *);
+extern void fletcher_2_endian(const void *, uint64_t, zfs_endian_t endian,
+	zio_cksum_t *);
+extern void fletcher_4_endian(const void *, uint64_t, zfs_endian_t endian,
+	zio_cksum_t *);
+
+#endif /* _SYS_ZIO_CHECKSUM_H */
diff --git a/include/zfs_common.h b/include/zfs_common.h
new file mode 100644
index 0000000..04e73d0
--- /dev/null
+++ b/include/zfs_common.h
@@ -0,0 +1,109 @@
+/*
+ * ZFS filesystem port for Uboot by
+ * Jorgen Lundman <lundman at lundman.net>
+ *
+ * zfsfs support
+ * made from existing GRUB Sources by Sun, GNU and others.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+ */
+
+#ifndef __ZFS_COMMON__
+#define __ZFS_COMMON__
+
+#define SECTOR_SIZE 0x200
+#define SECTOR_BITS 9
+
+
+typedef enum zfs_endian {
+	UNKNOWN_ENDIAN = -2,
+	LITTLE_ENDIAN = -1,
+	BIG_ENDIAN = 0
+} zfs_endian_t;
+
+
+/* Endian macros. */
+#define zfs_to_cpu16(x, a) (((a) == BIG_ENDIAN) ? be16_to_cpu(x) \
+	: le16_to_cpu(x))
+#define cpu_to_zfs16(x, a) (((a) == BIG_ENDIAN) ? cpu_to_be16(x) \
+	: cpu_to_le16(x))
+
+#define zfs_to_cpu32(x, a) (((a) == BIG_ENDIAN) ? be32_to_cpu(x) \
+	: le32_to_cpu(x))
+#define cpu_to_zfs32(x, a) (((a) == BIG_ENDIAN) ? cpu_to_be32(x) \
+	: cpu_to_le32(x))
+
+#define zfs_to_cpu64(x, a) (((a) == BIG_ENDIAN) ? be64_to_cpu(x) \
+	: le64_to_cpu(x))
+#define cpu_to_zfs64(x, a) (((a) == BIG_ENDIAN) ? cpu_to_be64(x) \
+	: cpu_to_le64(x))
+
+
+enum zfs_errors {
+	ZFS_ERR_NONE = 0,
+	ZFS_ERR_NOT_IMPLEMENTED_YET = -1,
+	ZFS_ERR_BAD_FS = -2,
+	ZFS_ERR_OUT_OF_MEMORY = -3,
+	ZFS_ERR_FILE_NOT_FOUND = -4,
+	ZFS_ERR_BAD_FILE_TYPE = -5,
+	ZFS_ERR_OUT_OF_RANGE = -6,
+};
+
+struct zfs_filesystem {
+
+	/* Block Device Descriptor */
+	block_dev_desc_t *dev_desc;
+};
+
+
+extern block_dev_desc_t *zfs_dev_desc;
+
+struct device_s {
+	uint64_t part_length;
+};
+typedef struct device_s *device_t;
+
+struct zfs_file {
+	device_t device;
+	uint64_t size;
+	void *data;
+	uint64_t offset;
+};
+
+typedef struct zfs_file *zfs_file_t;
+
+struct zfs_dirhook_info {
+	int dir;
+	int mtimeset;
+	time_t mtime;
+	time_t mtime2;
+};
+
+
+
+
+struct zfs_filesystem *zfsget_fs(void);
+int init_fs(block_dev_desc_t *dev_desc);
+void deinit_fs(block_dev_desc_t *dev_desc);
+int zfs_open(zfs_file_t, const char *filename);
+uint64_t zfs_read(zfs_file_t, char *buf, uint64_t len);
+struct zfs_data *zfs_mount(device_t);
+int zfs_close(zfs_file_t);
+int zfs_ls(device_t dev, const char *path,
+	int (*hook) (const char *, const struct zfs_dirhook_info *));
+int zfs_devread(int sector, int byte_offset, int byte_len, char *buf);
+int zfs_set_blk_dev(block_dev_desc_t *rbdd, int part);
+void zfs_unmount(struct zfs_data *data);
+int lzjb_decompress(void *, void *, uint32_t, uint32_t);
+#endif
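
A note for readers on the endian macros that close include/zfs_common.h: they pick a big-endian or little-endian conversion depending on the byte order detected for the pool. The fragment below is only a rough sketch of that idea in portable C and is not code from the patch. The names sketch_endian_t and sketch_zfs_to_cpu64() are made up for illustration, and the helper decodes from raw bytes instead of calling U-Boot's be64_to_cpu()/le64_to_cpu().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the patch's zfs_endian_t values. */
typedef enum sketch_endian {
	SKETCH_LITTLE_ENDIAN = -1,
	SKETCH_BIG_ENDIAN = 0
} sketch_endian_t;

/*
 * Decode eight on-disk bytes into a host uint64_t, choosing the byte
 * order at run time, in the same spirit as zfs_to_cpu64(x, a).
 */
static uint64_t sketch_zfs_to_cpu64(const uint8_t *raw, sketch_endian_t e)
{
	uint64_t v = 0;
	int i;

	assert(raw != NULL);
	for (i = 0; i < 8; i++) {
		/* Big endian: most significant byte comes first on disk. */
		int idx = (e == SKETCH_BIG_ENDIAN) ? i : 7 - i;

		v = (v << 8) | raw[idx];
	}
	return v;
}
```

Because the result is assembled from individual bytes, the helper gives the same answer on a big-endian or little-endian host, which is the property the macros rely on.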

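The comments in include/zfs/zap_impl.h say the embedded pointer table holds 8-byte entries, fills the second half of the header block, and therefore starts at 64-bit word number 1 << ZAP_EMBEDDED_PTRTBL_SHIFT(zap). The arithmetic can be sketched as below. This is an illustration under assumptions, not code from the patch: the functions are hypothetical, and block_shift stands in for whatever FZAP_BLOCK_SHIFT(zap) yields, since that macro is not defined in this part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* log2 of the entry count: block size / 8-byte entries / 2. */
static unsigned sketch_embedded_ptrtbl_shift(unsigned block_shift)
{
	assert(block_shift >= 4);
	return block_shift - 3 - 1;
}

/*
 * 64-bit word index, counted from the start of the header block, of
 * embedded pointer table entry idx. Mirrors the index expression in
 * ZAP_EMBEDDED_PTRTBL_ENT(zap, idx).
 */
static uint64_t sketch_embedded_ptrtbl_word(unsigned block_shift, uint64_t idx)
{
	uint64_t nent = 1ULL << sketch_embedded_ptrtbl_shift(block_shift);

	assert(idx < nent);
	return idx + nent;
}
```

For a 16 KiB block (block_shift of 14) this gives 1024 entries, with entry 0 at word 1024, i.e. byte offset 8192, exactly half-way through the block.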
Dear Jorgen Lundman,
In message 1342766905-1275-2-git-send-email-lundman@lundman.net you wrote:
U-Boot port is based on sources forked from GRUB-0.97 by Sun in 2004, which can be found here: http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/grub/grub-0.97...
Released by Sun for GRUB under the license:
- This program is free software; you can redistribute it and/or modify
- it under the terms of the GNU General Public License as published by
- the Free Software Foundation; either version 2 of the License, or
- (at your option) any later version.
GRUB official releases include ZFS in version: ftp://alpha.gnu.org/gnu/grub/grub-1.99~rc1.tar.gz
And patched against GRUB Bazaar repository for ashift fixes (4KB HDDs) more conveniently found at github: https://github.com/pendor/grub-zfs/commit/e7b6ef3ac3b9685ac4c394c897b1d4221b...
Signed-off-by: Jorgen Lundman lundman@lundman.net
v5: * Re-port based on GPLv2 license files, from original Sun GRUB-0.97 and patch forward. Headers remained untouched, minor style changes in some function. No logic changes required.
v4: * Add doc/README.zfs documentation
v3: * add missing patch revision history (this text) * Submitted as single patch per Wolfgang Denk instructions
v2: * Keep Makefile placement alphabetically sorted. * Clean ugly line breaks and indentation errors * Fix license corruption in fs/Makefile
 Makefile | 2 +-
 common/Makefile | 1 +
 common/cmd_zfs.c | 236 +++++
 doc/README.zfs | 30 +
 fs/Makefile | 1 +
 fs/{ => zfs}/Makefile | 39 +-
 fs/zfs/dev.c | 137 +++
 fs/zfs/zfs.c | 2396 ++++++++++++++++++++++++++++++++++++++++++
 fs/zfs/zfs_fletcher.c | 88 ++
 fs/zfs/zfs_lzjb.c | 97 ++
 fs/zfs/zfs_sha256.c | 148 +++
 include/config_cmd_all.h | 1 +
 include/zfs/dmu.h | 120 +++
 include/zfs/dmu_objset.h | 43 +
 include/zfs/dnode.h | 81 ++
 include/zfs/dsl_dataset.h | 53 +
 include/zfs/dsl_dir.h | 49 +
 include/zfs/sa_impl.h | 35 +
 include/zfs/spa.h | 292 +++++
 include/zfs/uberblock_impl.h | 57 +
 include/zfs/vdev_impl.h | 70 ++
 include/zfs/zap_impl.h | 111 ++
 include/zfs/zap_leaf.h | 103 ++
 include/zfs/zfs.h | 122 +++
 include/zfs/zfs_acl.h | 55 +
 include/zfs/zfs_znode.h | 71 ++
 include/zfs/zil.h | 57 +
 include/zfs/zio.h | 92 ++
 include/zfs/zio_checksum.h | 50 +
 include/zfs_common.h | 109 ++
 30 files changed, 4730 insertions(+), 16 deletions(-)
 create mode 100644 common/cmd_zfs.c
 create mode 100644 doc/README.zfs
 copy fs/{ => zfs}/Makefile (56%)
 create mode 100644 fs/zfs/dev.c
 create mode 100644 fs/zfs/zfs.c
 create mode 100644 fs/zfs/zfs_fletcher.c
 create mode 100644 fs/zfs/zfs_lzjb.c
 create mode 100644 fs/zfs/zfs_sha256.c
 create mode 100644 include/zfs/dmu.h
 create mode 100644 include/zfs/dmu_objset.h
 create mode 100644 include/zfs/dnode.h
 create mode 100644 include/zfs/dsl_dataset.h
 create mode 100644 include/zfs/dsl_dir.h
 create mode 100644 include/zfs/sa_impl.h
 create mode 100644 include/zfs/spa.h
 create mode 100644 include/zfs/uberblock_impl.h
 create mode 100644 include/zfs/vdev_impl.h
 create mode 100644 include/zfs/zap_impl.h
 create mode 100644 include/zfs/zap_leaf.h
 create mode 100644 include/zfs/zfs.h
 create mode 100644 include/zfs/zfs_acl.h
 create mode 100644 include/zfs/zfs_znode.h
 create mode 100644 include/zfs/zil.h
 create mode 100644 include/zfs/zio.h
 create mode 100644 include/zfs/zio_checksum.h
 create mode 100644 include/zfs_common.h
Applied, thanks.
Best regards,
Wolfgang Denk

> Applied, thanks.
> Best regards,
That is the best news ever, thanks! Also thanks to Graeme for all the help.
Lund

Dear Jorgen Lundman,
In message 5024F981.2040207@lundman.net you wrote:
> That is the best news ever, thanks! Also thanks to Graeme for all the help.
Thanks for your hard work, it's highly appreciated (even though it sometimes doesn't look like that - it really is).
Best regards,
Wolfgang Denk
participants (5)
- Graeme Russ
- Jorgen Lundman
- Mike Frysinger
- Prabhakar Lad
- Wolfgang Denk